RECOMBINATION AND BITSETS 

JOSE RODRIGUEZ, F. B. CHRISTIANSEN* & H. F. HOENIGSBERG†

A bitset is a set that encodes a binary number. Bitsets are the basis of
a beautiful theory of recombination with n loci. Here we begin from scratch
and advance to the derivation of the fundamental results about the
evolution of gamete frequencies and of disequilibrium measures, with and without
migration. All techniques are illustrated, and we have made
a great effort to keep the mathematics of this work accessible even to students
in their first year at the university.

1 RECOMBINANT OPERATORS 

Amphimictic diploid organisms are those that develop from a single cell which
results from the fusion of a paternal haploid spermatozoon and a maternal haploid
ovule. Each cell in the resulting organism will have approximately the same
genetic material, although as cell divisions proceed, especially those at gametic
meiosis, linkage can change and the individual will produce gametes with mixed
maternal and paternal genomes. The process leading to this ordered mixing is
called recombination.

Diploidy offers at least two possible evolutionary phenomena: 1) It allows
for two versions of given genetic information, each of which can be fitted to
a different environment. In this case, recombination emerges as a process that
puts together harmonious gene complexes in different arrangements. 2) Given
that reproduction occasionally involves useless errors in replication and that
these errors are few, recombination can be viewed as a mechanism that either
repairs by reestablishing adaptive combinations or that puts together potentially
harmonious genes lying in different gametes. In fact, this is Muller's theory of
a "higher evolutionary rate" for sexual (amphimixis) over asexual
(parthenogenesis) species. However, Muller's theory does not consider small population
sizes. The advantage of sexual diploidy, which rests on its recombinational
machinery for repairing DNA, disappears in small populations because new mutants will,
by chance, either become fixed or disappear before they can recombine
profitably.

Let us consider loci from 1 to n, each one with two alleles or versions noted
0 and 1. Then, a gamete is represented by a binary number; say, for n = 4, the
binary numbers 1010, 0001 and 1101 represent three different gametes.
Recombination can be modeled by a relation F so that each ordered pair of gametes
(the first place for the maternal gamete and the second for the paternal one
that fused into the original zygote) implies a third gamete constructed with the
alleles of the ordered pair. So, F(1010, 0001) can be 1010 or 0001 or 1000 or
1001 or 0010 but cannot be 1100, because both parental gametes carry allele 0
at the second locus.

*Bioinformatics Research Center, University of Aarhus, Denmark.

† Instituto de Genetica Evolutiva y Biologia Molecular & Instituto de Genetica Ecologica
y Biodiversidad del Tropico Americano. Bogota D.C., Colombia.

Since F is not univalued, a set of functions can be defined by introducing
the concept of recombinant operator $F_G$, where G is a binary number, which
operates over a zygote ($G_1$ = maternal, $G_2$ = paternal), such that $F_G(G_1, G_2)$
is a third gamete that has at a given locus the maternal allele whenever G has
a 1, or the paternal one if G has a 0. For example, $F_{1011}(1001, 0011) = 1001$;
$F_{1001}(1001, 0011) = 1011$; $F_{0011}(1001, 1100) = 1101$ (see figure 1).
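The action of a recombinant operator can be checked mechanically. What follows is a minimal Python sketch, assuming that gametes and operators are written as binary strings read from locus 1 to locus n; the function name `recombine` is our own choice for illustration and does not come from the original formulation.

```python
def recombine(g: str, maternal: str, paternal: str) -> str:
    """Apply the recombinant operator F_g to the zygote (maternal, paternal).

    At each locus the maternal allele is kept where g has a 1 and the
    paternal allele is kept where g has a 0.
    """
    if not (len(g) == len(maternal) == len(paternal)):
        raise ValueError("operator and gametes must have the same number of loci")
    return "".join(m if bit == "1" else p
                   for bit, m, p in zip(g, maternal, paternal))


# The three examples given in the text.
print(recombine("1011", "1001", "0011"))  # 1001
print(recombine("1001", "1001", "0011"))  # 1011
print(recombine("0011", "1001", "1100"))  # 1101
```

Strings are used here only because they mirror the notation of the text; integers with bitwise operations would serve equally well.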




Figure 1. Schematic representation of recombination: two initial gametes
interact to form an excited supergamete that can split into resultant gametes.
Output and initial gametes are different forms of information packing.

Formally, let $G_1$ and $G_2$ be two gametes and G a binary number that
represents a recombinant operator, i.e., $G = \sum a_i 2^{i-1}$, $G_1 = \sum b_i 2^{i-1}$,
$G_2 = \sum c_i 2^{i-1}$, where sums extend from 1 to n. Then $F_G(G_1, G_2) = \sum d_i 2^{i-1}$, where
$d_i = b_i$ if $a_i = 1$ or $d_i = c_i$ if $a_i = 0$. Coefficients $a_i, b_i, c_i, d_i$ are 0 or 1 for each
i from 1 to n. That gametes and recombinant operators are in a one to one
correspondence can be realized from the following identity:

$F_G(\mathrm{ONE}, \mathrm{ZERO}) = G$ (1)

where ONE stands for the gamete that has allele 1 at all loci,
and similarly, ZERO stands for the gamete which has allele 0 at all loci. For
instance, $F_{1101}(1111, 0000) = 1101$. For this reason, we are allowed to identify
a recombinant operator with a gamete.

To construct mathematical models it is necessary to assign to each
recombinant operator $F_G$, which is an abstract concept meaning recombinant power,
a probability $R(F_G)$. Thanks to equation (1) it is possible to identify this
abstract concept with measurable items. Then, it is possible to use the
frequencies $R(G)$ of each output gamete G as estimators of the probabilities $R(F_G)$
of the corresponding recombinant operator $F_G$. The distribution of $R(G)$ on
gametes, rather than on recombinant operators, will be referred to as the Geirenger
recombination distribution (Christiansen, 1987).

Physically, recombination is understood as an exchange of information
between a pair of gametes; therefore, it would be more natural to represent the
result not just as one gamete, but rather as two, say $F_G(G_1, G_2) = (G_3, G_4)$,
with the property that $G_1 + G_2 = G_3 + G_4$, where the binary sum is taken
locus by locus. For example, 1001 + 1100 = 1101 + 1000, meaning that if
gametes 1001 and 1100 combine and as a result of recombination gamete 1101
is produced, then it is necessary that gamete 1000 be produced also.

Note that, in general, $F_G(G_1, G_2) = G_3$ does not imply that $F_{1-G}(G_1, G_2) = 1 - G_3$.
For example, $F_{1101}(0010, 1100) = 0000$ while $F_{0010}(0010, 1100) = 1110$,
but $0000 + 1110 \neq$ ONE. However, when $G_1$ = ONE and $G_2$ = ZERO, then we
have that if $F_G(G_1, G_2) = G_3$, then $F_{1-G}(G_1, G_2) = 1 - G_3$. It would be
possible to model recombination by pairs of recombinant operators $(F_G, F_{1-G})$,
and in this case, the null hypothesis par excellence would be Mendel's rule of
segregation, expressed by the equation

$R(F_G) = R(F_{1-G})$ (2)

or in Geirenger formulation as

$R(G) = R(1 - G)$ (2')

Mendel's rule (2, 2') assumes that recombination does not interfere with the
performance of a given gamete relative to zygote formation.
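The relation between an operator and its complement can be illustrated with a short Python sketch. It assumes binary strings for gametes and operators; the helper names `complement`, `recombine_pair` and `locus_sums` are hypothetical and introduced only for this illustration.

```python
def recombine(g, maternal, paternal):
    """F_g: maternal allele where g has a 1, paternal allele where g has a 0."""
    return "".join(m if bit == "1" else p
                   for bit, m, p in zip(g, maternal, paternal))


def complement(g):
    """The operator 1 - G: every bit of g is flipped."""
    return "".join("0" if bit == "1" else "1" for bit in g)


def recombine_pair(g, maternal, paternal):
    """The two reciprocal products (G3, G4) given by F_G and F_{1-G}."""
    return (recombine(g, maternal, paternal),
            recombine(complement(g), maternal, paternal))


def locus_sums(x, y):
    """Sum of two gametes taken locus by locus."""
    return [int(a) + int(b) for a, b in zip(x, y)]


g3, g4 = recombine_pair("1101", "0010", "1100")
print(g3, g4)                                                 # 0000 1110
print(locus_sums("0010", "1100") == locus_sums(g3, g4))       # True
print(recombine_pair("1101", "1111", "0000"))                 # ('1101', '0010')
```

The second printed line shows that alleles are conserved locus by locus even though the two products are not complementary gametes; the third shows the special case in which the zygote is (ONE, ZERO).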



2 SOME EXAMPLES ON GEIRENGER DISTRIBUTIONS

To fix ideas, let us consider some simple and important cases. 
2.1 The two loci case 

Recombination is evidenced by output gametes 10 and 01, while no
recombination is demonstrated by output gametes 00 and 11. When Mendel's law
of segregation (2') is obeyed, a single parameter $c = R(01) + R(10)$ (the
probability of recombination) defines the whole process. In effect, by (2'),
$R(10) = R(01) = c/2$, and given that the sum of all probabilities is one,
then $R(00) + R(11) = 1 - c$, the probability of no recombination. Again by
applying (2'), we have $R(00) = R(11) = (1 - c)/2$. Without Mendel's law of
segregation three parameters would be necessary to define recombination, for the
only information granted is the normalizing condition $\sum R(G) = 1$.
Since recombination is not a deterministic phenomenon it must be modeled by
a random variable, with a given expected value and a given variance.
In the two loci case, the probability of recombination, c, has a maximum
expected value of 1/2, as we show below. To recombine, gametes must interact,
but since they are physically stable DNA structures, they must be activated
with some specific energy. So, let us consider the following chemical reaction
model, as in figure 1:

$G_1 + G_2 \rightleftharpoons (G_1 G_2)^* \rightarrow G_3 + G_4$

where $G_1$, $G_2$ are the gametes that gave origin to the output gametes $G_3$, $G_4$,
while $(G_1 G_2)^*$ is the activated complex. $G_1$, $G_2$ have the probability v to go into
the activated complex $(G_1 G_2)^*$, which can then split into output gametes $G_3$, $G_4$
with probability s. The former gametes $G_1$, $G_2$ can be recovered with probability
t. The maximum value of v is 1, while t and s can be considered equal to one
another; since $t + s = 1$, $s = 1/2$. Since the probability of recombination is $c = vs$,
the maximum of c is 1/2. This bound could serve to evidence selection,
but since this value has sampling variance, calculations involving the expected
variance of c must be carried out (see Karlin et al., 1978).
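Under Mendel's rule the two-locus Geirenger distribution is fixed by c alone. The following Python sketch builds it; the function name `two_locus_distribution` is ours, and exact fractions are used only to avoid rounding in the checks.

```python
from fractions import Fraction


def two_locus_distribution(c):
    """Geirenger distribution for two loci under Mendel's rule (2').

    c is the probability of recombination, c = R(01) + R(10).
    """
    c = Fraction(c)
    if not 0 <= c <= 1:
        raise ValueError("c must be a probability")
    return {
        "00": (1 - c) / 2,   # no recombination
        "11": (1 - c) / 2,   # no recombination
        "01": c / 2,         # recombination
        "10": c / 2,         # recombination
    }


R = two_locus_distribution(Fraction(1, 5))
print(R["01"] + R["10"])   # 1/5, the recombination probability c
print(sum(R.values()))     # 1, the normalizing condition
```

With v = 1 and s = 1/2 in the activation model above, c = vs = 1/2 and all four output gametes become equally likely.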

2.2 The three loci case 

In the three loci case there are eight types of recombinant gametes ($2^3$) whose
probabilities are to be specified. Mendel's rule provides 4 constraints:

$R(000) = R(111)$,

$R(001) = R(110)$,

$R(010) = R(101)$,

$R(011) = R(100)$.

Moreover, since we have the condition $\sum R(G) = 1$, three more independent
equations are required. This can be reduced to specifying $R(G)$ for three gametes lying in
different equations among the four just enumerated. But there are other ways
to produce three independent equations. We are going to present two of them:
the exclusive and the inclusive representations.

The exclusive formulation involves the following parameters (loci are read
from left to right):

1. r: the probability of recombination between the first two loci without
recombination between the last two, $r = R(011) + R(100)$.

2. s: the probability of recombination between the last two loci without
recombination between the first two, $s = R(110) + R(001)$.

3. t: the probability of simultaneous recombination between the first two loci
and the last two, $t = R(101) + R(010)$.

The inclusive formulation and its relation to the exclusive one is established
by the following parameters:

1. u: the probability of recombination between the first two loci with or
without recombination between the last two, $u = R(010) + R(101) + R(100) + R(011) = r + t$.

2. v: the probability of recombination between the last two loci with or
without recombination between the first two, $v = R(101) + R(010) + R(110) + R(001) = t + s$.

3. w: the probability of recombination between the first two loci without
recombination between the last two, plus the probability of recombination
between the last two without recombination between the first two,
$w = R(011) + R(100) + R(110) + R(001) = r + s$.
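Both parametrizations can be read directly from a Geirenger distribution. The Python sketch below does so for an illustrative distribution; the numerical values are invented for the example (they are not data from any experiment) and the function names are hypothetical.

```python
from fractions import Fraction as Fr


def exclusive_parameters(R):
    """Return (r, s, t) from a three-locus Geirenger distribution R."""
    r = R["011"] + R["100"]   # break between loci 1 and 2 only
    s = R["110"] + R["001"]   # break between loci 2 and 3 only
    t = R["101"] + R["010"]   # both breaks
    return r, s, t


def inclusive_parameters(R):
    """Return (u, v, w) = (r + t, t + s, r + s)."""
    r, s, t = exclusive_parameters(R)
    return r + t, t + s, r + s


# Illustrative distribution obeying Mendel's rule and summing to one.
R = {"000": Fr(13, 40), "111": Fr(13, 40),
     "011": Fr(1, 20),  "100": Fr(1, 20),
     "110": Fr(1, 10),  "001": Fr(1, 10),
     "101": Fr(1, 40),  "010": Fr(1, 40)}

print(exclusive_parameters(R))   # r = 1/10, s = 1/5, t = 1/20
print(inclusive_parameters(R))   # u = 3/20, v = 1/4, w = 3/10
```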

2.3 The four loci case 

In the four loci case we have $2^4 = 16$ types of gametes. Mendel's law furnishes
8 equations; with the normalizing condition $\sum R(G) = 1$ we have a total of 9
equations. Therefore, 7 other equations are required to completely specify the
ensemble of probabilities of each kind of output gamete.

The parameters corresponding to the exclusive representation of the
recombination process can be defined in the following way:

$E_1 = R(1000) + R(0111)$

$E_2 = R(1100) + R(0011)$

$E_3 = R(1110) + R(0001)$

$E_{12} = R(1011) + R(0100)$

$E_{13} = R(1001) + R(0110)$

$E_{23} = R(1101) + R(0010)$

$E_{123} = R(1010) + R(0101)$.

Subindexes indicate the divisions at which recombination takes place. For
example, $E_{12}$ means the probability of simultaneous recombination between loci
1 and 2 and between loci 2 and 3.

One possible inclusive representation can be derived from the exclusive one
by changing in the latter the connective "and" for the connective "or"; for
example, given that $E_1$ stands for the probability of recombination between loci
1 and 2 and no recombination among any other loci, we define $I_1$ as
$E_1 + E_{12} + E_{13} + E_{123}$, i.e., the probability of recombination between loci 1 and 2 with or
without recombination among other loci, which is the same as the probability
of recombination involving locus 1. All cases are:

$I_1 = E_1 + E_{12} + E_{13} + E_{123}$

$I_2 = E_2 + E_{12} + E_{23} + E_{123}$

$I_3 = E_3 + E_{13} + E_{23} + E_{123}$

$I_{12} = E_1 + E_2 + E_{12} + E_{13} + E_{23} + E_{123}$

$I_{13} = E_1 + E_3 + E_{12} + E_{13} + E_{23} + E_{123}$

$I_{23} = E_2 + E_3 + E_{12} + E_{13} + E_{23} + E_{123}$

$I_{123} = E_1 + E_2 + E_3 + E_{12} + E_{13} + E_{23} + E_{123}$

Observe that $I_{23} = I_2 + I_3$, where the sum is understood as "sum without
repetitions".

These representations can be generalized to n loci, and with an increasing
n it is possible to present more and more different types of representations.
However, any representation must have the same number of independent
equations: Mendel's law provides $2^n/2 = 2^{n-1}$ equations, which together with the
normalizing condition provides a total of $2^{n-1} + 1$ equations. Therefore, any
representation must have $2^n - 2^{n-1} - 1 = 2 \times 2^{n-1} - 2^{n-1} - 1 = 2^{n-1} - 1$
independent equations.
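For any number of loci the exclusive parameters can be obtained by grouping gametes according to the set of divisions at which the parental origin changes. The Python sketch below is one possible reading of the definitions above; the names `breakpoints`, `exclusive` and `inclusive` are ours, and the inclusive parameter is taken, as in the four loci list, to be the sum of all exclusive parameters whose index shares at least one division with the given one.

```python
from fractions import Fraction
from itertools import product


def breakpoints(g):
    """Divisions i (between loci i and i + 1) at which the operator g switches parent."""
    return frozenset(i + 1 for i in range(len(g) - 1) if g[i] != g[i + 1])


def exclusive(R):
    """Exclusive parameters E_S, indexed by the non-empty set S of divisions."""
    E = {}
    for g, prob in R.items():
        S = breakpoints(g)
        if S:
            E[S] = E.get(S, 0) + prob
    return E


def inclusive(E, S):
    """Inclusive parameter I_S: sum of E_T over all T sharing a division with S."""
    S = frozenset(S)
    return sum(prob for T, prob in E.items() if T & S)


# Free recombination on four loci: all 16 output gametes equally likely.
R = {"".join(bits): Fraction(1, 16) for bits in product("01", repeat=4)}
E = exclusive(R)
print(len(E))                 # 7 = 2**(4 - 1) - 1 exclusive parameters
print(inclusive(E, {1}))      # I_1  = 1/2
print(inclusive(E, {2, 3}))   # I_23 = 3/4
```

The count of exclusive parameters agrees with the $2^{n-1} - 1$ independent equations found above.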

3 BITSETS AND RECOMBINANT OPERATORS

Binary notation of gametes and of recombinant operators is particularly useful
when computers are used. Gametes are represented by vectors in a discrete
space of n dimensions, each one with two possible states. Nevertheless, proofs
of assertions that are valid for n loci, n being any natural number, are
considerably simplified, and gain in mathematical beauty, by using set
theory, for it is possible to represent a recombinant operator or a gamete by
a set that is appropriately called a bitset. Java includes built-in facilities to deal
with bitsets (Rodriguez, 2009). A great effort has been invested to make this
beautiful theory accessible to researchers, and so the present work clarifies a previous version
(Rodriguez, Christiansen and Hoenigsberg, 1988).

Any recombinant operator is associated with a subset of natural numbers
corresponding to those loci in which the given operator picks up the maternal
allele. For example, to the recombinant operator (or gamete) 1001 corresponds
the set {1, 4}, while gamete 0101 corresponds to the set {2, 4} and the set {1, 2, 3}
represents the recombinant mode 1110. Correspondence between binary
numbers and subsets of the set {1, ..., n} is biunivocal, for to the binary digits 1 and
0 correspond respectively the notions "to be" and "not to be" in the
representation set. Representation of gametes by bitsets was introduced by Schnell
(1961).
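The correspondence between binary numbers and bitsets can be written down in a few lines. The following is a minimal Python sketch using the built-in `frozenset` type, with loci numbered from 1 and read from left to right as in the text; `to_bitset` and `to_binary` are names chosen here for illustration.

```python
def to_bitset(binary):
    """Set of loci (numbered from 1, left to right) at which the string has a 1."""
    return frozenset(i for i, bit in enumerate(binary, start=1) if bit == "1")


def to_binary(bitset, n):
    """Binary string of n loci with a 1 exactly at the loci in bitset."""
    return "".join("1" if i in bitset else "0" for i in range(1, n + 1))


print(sorted(to_bitset("1001")))   # [1, 4]
print(sorted(to_bitset("0101")))   # [2, 4]
print(to_binary({1, 2, 3}, 4))     # 1110
```

Since `to_binary(to_bitset(g), len(g))` returns g for every binary string g, the two functions are inverse to one another, which is the biunivocal correspondence mentioned above.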

Let us review the elementary definitions and properties of set theory. 

Always, the reference frame in set theory is the universal set, which in our
case is the set N = {1, ..., n}. It is said that A is a subset of B, noted $A \subseteq B$, if
any element of A is in B too. Observe that $A \subseteq A$ for any subset A of N.

Operations are defined between subsets of N. The union takes a pair of
subsets A and B and produces an output $A \cup B$, which is a subset that contains
all the elements that are in A or in B, without repetitions. The intersection of
A and B, $A \cap B$, is a subset consisting of all elements common to A and B.
The subtraction of B from A gives a subset $A - B$ which includes all elements
that are in A but not in B. The symmetric difference between A and B gives
a subset $A \Delta B$ that equals $(A - B) \cup (B - A)$. And, finally, to any subset A is
associated its set of subsets $p(A) = \{B : B \subseteq A\}$.

Venn's representation of these operations is presented in figure 2. 

When some property is to be visualized, it can be discussed on Venn's
diagrams; if 3 subsets are involved, then the discussion can be guided by a Venn's
diagram involving three circles in which there are three two-set and one
three-set intersections. Some examples follow: let N = {1, ..., 8}, A = {1, 2, 3, 4},
B = {3, 4, 7}; then

$A \cup B = \{1, 2, 3, 4, 7\}$,

$A \cap B = \{3, 4\}$,

$A - B = \{1, 2\}$,

$B - A = \{7\}$,

$[B - A] - B = \{\} = \emptyset$, this is the empty set,

$A \Delta B = \{1, 2\} \cup \{7\} = \{1, 2, 7\}$,

$p(B) = \{\emptyset, \{3\}, \{4\}, \{7\}, \{3, 7\}, \{3, 4\}, \{4, 7\}, \{3, 4, 7\}\}$.

Observe that $A - B$ is different from $B - A$ but $A \Delta B = B \Delta A$.
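These examples can be reproduced with the set type built into Python, whose operators `|`, `&`, `-` and `^` stand for union, intersection, difference and symmetric difference. The helper `power_set` below is our own small sketch of p(A), written with `itertools.combinations`.

```python
from itertools import combinations


def power_set(s):
    """All subsets of s, i.e. p(s), listed by increasing size."""
    items = sorted(s)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


A = {1, 2, 3, 4}
B = {3, 4, 7}

print(A | B)                # union: {1, 2, 3, 4, 7}
print(A & B)                # intersection: {3, 4}
print(A - B)                # difference: {1, 2}
print(B - A)                # difference: {7}
print((B - A) - B)          # the empty set, shown by Python as set()
print(A ^ B)                # symmetric difference: {1, 2, 7}
print(len(power_set(B)))    # 8 subsets
```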




Figure 2. Venn's diagrams of set operations. Binary operations union: $\cup$;
intersection: $\cap$; the unary operation complement: '; difference: $-$; and
symmetric difference: $\Delta$ (more explanation in the text).

In general, it is said that an operation $\circ$ is commutative if the order in which
subsets are presented to the operation is not important: $A \circ B = B \circ A$. It is said
that an operation is associative if $(A \circ B) \circ C = A \circ (B \circ C)$. In this case,
the parentheses do not matter. For example, while $(A - B) - C$ is in general not equal
to $A - (B - C)$, $A \Delta (B \Delta C)$ equals $(A \Delta B) \Delta C$. Union, $\cup$, intersection, $\cap$, and
symmetric difference, $\Delta$, are commutative and associative operations over N.
Moreover, $A \Delta \emptyset = (A - \emptyset) \cup (\emptyset - A) = A$ for any $A \subseteq N$, i.e., $\emptyset$ is a neutral element
for $\Delta$, an element that plays with respect to $\Delta$ the same role as 0 with respect to
the sum of integer numbers; on the other hand, $A \Delta A = (A - A) \cup (A - A) = \emptyset$,
i.e., A cancels itself with respect to the neutral element of $\Delta$.

Also, we have the distributive laws:

$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$

which can be visualized through Venn's diagrams or verbalized, say, for the
first law, in the following way: elements common to B and C gathered with
those in A are those same elements that are both in the union of A and B and
in the union of A and C.
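For a small universal set the algebraic laws stated above can be verified exhaustively. The following Python sketch does this by brute force over all triples of subsets; it is a numerical check under the assumption of a small n, not a proof, and the function names are ours.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of the given collection, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


def laws_hold(n):
    """Check the laws of union, intersection and symmetric difference on {1, ..., n}."""
    sets = subsets(range(1, n + 1))
    empty = frozenset()
    for a in sets:
        # Neutral element and self-cancellation for the symmetric difference.
        if (a ^ empty) != a or (a ^ a) != empty:
            return False
        for b in sets:
            # Commutativity.
            if (a | b) != (b | a) or (a & b) != (b & a) or (a ^ b) != (b ^ a):
                return False
            for c in sets:
                # Associativity of the symmetric difference.
                if ((a ^ b) ^ c) != (a ^ (b ^ c)):
                    return False
                # The two distributive laws.
                if (a | (b & c)) != ((a | b) & (a | c)):
                    return False
                if (a & (b | c)) != ((a & b) | (a & c)):
                    return False
    return True


print(laws_hold(4))                         # True
one = frozenset({1})
print((one - one) - one == one - (one - one))   # False: difference is not associative
```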

The set $N - A$ is noted $A'$ and it is referred to as the complement of A.
Mendel's law is written as

$R(F_A) = R(F_{A'})$ (3)

or in Geirenger representation as

$R(A) = R(A')$ (3')

Formula (3') is sometimes the most suitable, because it directly represents
a recombination process, but when arithmetic is involved, formula (3) is better.
Since a gamete is represented by a subset A of N = {1, ..., n}, then a genotype is
a couple of gametes, and given that we gave the female gamete the first place,
a genotype is an ordered pair of subsets $(A_1, A_2)$ belonging to $p(N) \times p(N)$,
which is the set of all couples that can be formed with elements of $p(N)$.

4 EVOLUTION OF GAMETE FREQUENCIES 
IN A PANMICTIC POPULATION 

A panmictic population is one that has no reproductive barriers or biases with
respect to random mating. As an introduction let us show that in a panmictic
infinite population, random mating of diploid zygotes is equivalent to the
random mating of the haploid gametes they produce. In fact, let us consider that
in the population the frequency of females with genotype $G_iG_j$ is equal to the
frequency of males with the same genotype, equal to $P_{ij}$, and that each female
contributes F gametes while each male contributes M gametes. Let us imagine
that each individual throws its gametes into a common reservoir. Let $p_i$ be the
frequency of gamete $G_i$ and $p_k$ the frequency of gamete $G_k$. Since each male
contributes a number M of gametes, and each female F gametes, then we have:

$p_i = M P_{ii}/(M + F) + F P_{ii}/(M + F) + M \sum P_{ij}/(M + F) + F \sum P_{ij}/(M + F) = P_{ii} + \sum P_{ij}$

similarly

$p_k = P_{kk} + \sum P_{kl}$

where the sums run over the heterozygous genotypes. Therefore, the frequency
of the haploid mating $G_iG_k$ at the gamete reservoir is

$p_i p_k = (P_{ii} + \sum P_{ij})(P_{kk} + \sum P_{kl}) = P_{ii}P_{kk} + \sum P_{ii}P_{kl} + \sum P_{ij}P_{kk} + \sum\sum P_{ij}P_{kl}$.

We include in this last equation all possible matings among diploid zygotes
that will produce the zygote $G_iG_k$: matings among homozygotes, homozygotes
with heterozygotes, and among heterozygotes. In short, we have proved that:

$p_i p_k = P_{ik}$

or that random mating of diploid zygotes is equivalent to the random mating
of the haploid gametes they produce.

Nevertheless, in this proof, there is one hidden assumption: there is no
recombination anywhere! To remove this assumption we can proceed as follows:
the frequency in the population of the individual $G_iG_j$ is $p_i p_j$; this individual
will produce gamete $G_k$ with a given probability $R_{ijk}$ that depends on the
recombination scheme.

Therefore, at the reservoir, genotype $G_iG_j$ will contribute gamete $G_k$ in
a frequency $R_{ijk} p_i p_j$. But $G_k$ can come from many more genotypes, and the
frequency of gamete $G_k$ in the reservoir will be of the form $\sum R_{ijk} P_{ij}$, where the
sum is extended over those individuals $G_iG_j$ that, through recombination, can
produce gamete $G_k$. Then, in the reservoir, the probability of mating among
haploid gametes $G_k$ and $G_l$ is of the form

$p_k p_l = (\sum R_{ijk} p_i p_j)(\sum R_{mnl} p_m p_n) = \sum R_{ijk} R_{mnl} p_i p_j p_m p_n$

where sums extend over i, j, m and n. But this last expression is equal to the
probability of the double event of mating zygotes $G_iG_j$ with those of genotype
$G_mG_n$, with the condition that they will produce through recombination gamete
$G_k$ and gamete $G_l$. So

$p_k p_l = \sum R_{ijk} R_{mnl} P_{ij} P_{mn}$

The exact form of $R_{ijk}$ could be calculated from equations given below,
although we will not write it explicitly. As before, one could assume that males
and females produce different numbers of gametes.

Hence, random mating among individuals is equivalent to random mating
among their gametes! Thanks to this result we will deal with populations of
gametes rather than with populations of diploid individuals.

A genotype can be represented in Venn's diagrams as in figure 3. 

Recalling that numbers appearing in the set representation of a gamete are those
corresponding to maternal alleles, while paternal alleles are those in the
complement of the set, then a genotype (A, B) is homozygous for maternal alleles at the
intersection $A \cap B$, homozygous for paternal alleles outside the representing sets,
i.e., at $(A \cup B)' = A' \cap B'$, and it is heterozygous at $A \Delta B = (A - B) \cup (B - A)$:
in $A - B$ lie the loci in which A has maternal alleles while B has paternal ones, and
in $B - A$ lie the loci in which B has maternal alleles while A has paternal ones.
Note that $N = (A \cup B) \cup (A \cup B)' = (A \cap B) \cup (A \Delta B) \cup (A \cup B)'$.

For example, if N = {1, ..., 5}, the genotype ({1, 3, 5}, {3, 4, 5}) is
homozygous for maternal alleles at loci 3 and 5, i.e., at {3, 5}, while it is homozygous
for paternal alleles at the intersection of the complements of A and B, i.e., in
$\{2, 4\} \cap \{1, 2\} = \{2\}$, and it is heterozygous at loci

$(\{1, 3, 5\} - \{3, 4, 5\}) \cup (\{3, 4, 5\} - \{1, 3, 5\}) = \{1, 4\}$.

Here, we have that $N = \{1, ..., 5\} = \{3, 5\} \cup \{2\} \cup \{1, 4\}$ and that these three
sets are pairwise disjoint, which means that they, taken by groups of two, do
not have any member in common.

Which conditions must hold for a genotype (B, C) to give a gamete A
as product of recombination? In more technical words, which conditions on
a genotype (B, C) guarantee that there is at least one recombinant operator
$F_G$ such that $F_G(B, C) = A$?




Figure 3. A genotype in set representation: a genotype is a couple of gametes
$(G_1, G_2)$, the first maternal and the second paternal. Since a gamete is
represented by a set, the elements belonging to it are numbers corresponding to
those loci in which there are maternal alleles, and elements outside the set
correspond to those loci whose alleles are paternal; then a genotype is
homozygous for maternal alleles in $G_1 \cap G_2$, homozygous for paternal alleles in
$(G_1 \cup G_2)'$ and heterozygous in $G_1 \Delta G_2$.
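The three regions of a genotype in set representation can be computed directly with set operations. The Python sketch below uses the genotype ({1, 3, 5}, {3, 4, 5}) on five loci discussed in the text; the function name `genotype_partition` and the dictionary keys are our own labels.

```python
def genotype_partition(g1, g2, n):
    """Split the loci 1..n of the genotype (g1, g2) into its three regions."""
    g1, g2 = frozenset(g1), frozenset(g2)
    universe = frozenset(range(1, n + 1))
    return {
        "homozygous_maternal": g1 & g2,               # G1 ∩ G2
        "homozygous_paternal": universe - (g1 | g2),  # (G1 ∪ G2)'
        "heterozygous": g1 ^ g2,                      # G1 Δ G2
    }


regions = genotype_partition({1, 3, 5}, {3, 4, 5}, 5)
for name, loci in regions.items():
    print(name, sorted(loci))
# homozygous_maternal [3, 5]
# homozygous_paternal [2]
# heterozygous [1, 4]
```

The three regions are pairwise disjoint and their union is the whole set of loci, as the identity for N above requires.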

To fix ideas let us take N = {1, ..., 6} and A = {3, 4}; then the genotype
({2, 3}, {4, 5}) can recombine to give A, because the recombinant operator $F_{\{1,3,5\}}$
working over that genotype renders A. To see this it is helpful to turn to binary
representation: A = {3, 4} is represented by 001100, the maternal gamete {2, 3}
is denoted 011000, the paternal gamete {4, 5} is represented by 000110, and
the recombinant operator $F_{\{1,3,5\}}$ is 101010. Therefore, we can write the equation
$F_G(B, C) = A$ as $F_{101010}(011000, 000110) = 001100$. Note that the recombinant
operators $F_{001011}$, $F_{101011}$ can serve too; these modes of recombination can be
easily generated in view of the fact that the genotype is homozygous for loci 1
and 6, and that therefore, at these loci, it does not matter which allele is picked
up. But at heterozygous loci there is only one allele that can be picked up
correctly.

Some problems on recombinant operators cannot be solved: for example, the
gamete {1, 3, 4} cannot result from recombination between gametes of genotype
({2, 3}, {4, 5}), i.e., there is no recombinant operator $F_G$ such that $F_G(\{2, 3\}, \{4, 5\})$
gives {1, 3, 4}. This can be seen from binary notation: we must find $F_G$ such that
$F_G(011000, 000110) = 101100$, but at locus number one any recombinant
operator would produce a 0 while a 1 is required; with the remaining loci
there is no problem.

In general, an individual with genotype (B, C) can produce a gamete A if
and only if, first, $A \subseteq B \cup C$, for otherwise B and C would not have all the
maternal alleles required to construct A. And, second, the paternal alleles of A
are in $A'$, while the possible paternal alleles of B and C are in $B'$ or $C'$, so we
must have that $A' \subseteq B' \cup C' = (B \cap C)'$. Since $D' \subseteq E'$ if and
only if $E \subseteq D$, for any subsets D and E, the condition $A' \subseteq B' \cup C' = (B \cap C)'$
can be reexpressed as $B \cap C \subseteq A$.

In conclusion, genotype (B, C) can recombine to give A iff

$B \cap C \subseteq A \subseteq B \cup C$ (4)

See figure 4, in which the different areas are indicated as numbers 1-6.
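Condition (4) and the operators that realize it can be explored with a short Python sketch. It relies on the observation, made in the example above, that the operator is free at homozygous loci and forced at heterozygous ones; at a heterozygous locus the maternal allele must be picked exactly when A differs from C there. The function names are hypothetical.

```python
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


def can_produce(a, b, c):
    """Condition (4): B ∩ C ⊆ A ⊆ B ∪ C."""
    return (b & c) <= a <= (b | c)


def operators_producing(a, b, c, n):
    """All recombinant operators G (as bitsets) with F_G(B, C) = A on n loci."""
    a, b, c = frozenset(a), frozenset(b), frozenset(c)
    if not can_produce(a, b, c):
        return []
    universe = frozenset(range(1, n + 1))
    forced = a ^ c               # heterozygous loci where the maternal allele is needed
    free = universe - (b ^ c)    # homozygous loci: either choice gives the same allele
    return [forced | d for d in subsets(free)]


ops = operators_producing({3, 4}, {2, 3}, {4, 5}, 6)
print(sorted(sorted(g) for g in ops))
# [[1, 3, 5], [1, 3, 5, 6], [3, 5], [3, 5, 6]]
print(can_produce(frozenset({1, 3, 4}), frozenset({2, 3}), frozenset({4, 5})))  # False
```

The operators {1, 3, 5}, {3, 5, 6} and {1, 3, 5, 6} are those written 101010, 001011 and 101011 in the example of the text.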



Figure 4. A genotype (B, C), to produce a given gamete A as a result of
recombination, must observe the following relation: $B \cap C \subseteq A \subseteq B \cup C$. In
effect, to contribute maternal alleles A must be contained in $B \cup C$. But to
contribute paternal alleles, $A'$ must be contained in $B' \cup C' = (B \cap C)'$ or,
equivalently, $B \cap C$ must be contained in A.

Numbers in figure 4 have two possible interpretations: they are loci, and
serve as a particular example, or they are ensembles of loci and serve as the general
case. Having in mind that B and C are respectively the female and male gametes,
these six areas are characterized by:

• Area 1: the genotype is homozygous for allele 0; the recombinant operator
does not matter.

• Area 2: the genotype (B, C) is heterozygous, the mother B has allele 1 and
the father C has allele 0, while A has allele 0. The recombinant operator
must, in region 2, pick allele 0 from the father, therefore the recombinant
operator in region 2 is defined by a 0.

• Area 3: the genotype is heterozygous, the mother has 1, the father 0 and
the recombinant 1, so the recombinant operator must be 1 to take the
allele 1 from the mother.

• Area 4: the genotype is homozygous with allele 1; the recombinant
operator does not matter, it can be 0 or 1.

• Area 5: the genotype is heterozygous, the recombinant operator must be
0 to take the allele 1 from the father.

• Area 6: the genotype is heterozygous, the mother B has a 0 whereas the
father has a 1; since the recombinant has a 0, the recombinant operator
must be 1 to pick the maternal allele.

To summarize, the recombinant operator can be noted, in binary notation,
as $F_{-01-01}$, where the - sign stands for "either 0 or 1". On the other hand,
recombination matters only in the heterozygous areas 2, 3, 5 and 6, i.e., in
$(B - C) \cup (C - B) = B \Delta C$. In short, this can be presented in a single equation:

$F_{-01-01}(011100, 000111) = 001110$ (5)

where gamete B is represented by the binary number 011100 corresponding to
areas 2, 3 and 4, while gamete C is represented by 000111 corresponding to areas
4, 5 and 6. The recombinant gamete A is represented by 001110 corresponding to
regions 3, 4 and 5. Moreover, the recombinant operator $F_{-01-01}$ picks maternal
alleles at regions 3 and 6, which join to form the set $(A - C) \cup (C - A) = A \Delta C$.
To continue, we must specify how to relate equation (5) to probabilities. In
other words, knowing the probability of each recombinant operator, which is
the probability of $F_{-01-01}$? That probability will be noted as $R_{\{3,6\}/\{2,3,5,6\}}$
and we have:

$R_{\{3,6\}/\{2,3,5,6\}} = R_{A \Delta C / B \Delta C}$

and it must be equal to $R_{001001} + R_{001101} + R_{101001} + R_{101101}$. In set notation,
this can be written as

$R_{\{3,6\}/\{2,3,5,6\}} = R_{\{3,6\}} + R_{\{3,4,6\}} + R_{\{1,3,6\}} + R_{\{1,3,4,6\}}$

Observe that any subindex set in the right side of this equation can be written
as $\{3, 6\} \cup D$, where D is a subset of $\{1, 4\} = \{2, 3, 5, 6\}'$. Moreover, D scans
$p(\{2, 3, 5, 6\}')$, which contains all subsets of $\{2, 3, 5, 6\}'$. On the other hand, in
the left side, {3, 6} is a subset of {2, 3, 5, 6}. Therefore, we define

$R(F_{B/A}) = R_{B/A} = \sum_{C \subseteq A'} R(F_{B \cup C})$ for $B \subseteq A$ (6)

or in Geirenger representation,

$R(B/A) = \sum_{C \subseteq A'} R(B \cup C)$ for $B \subseteq A$ (6')

$R_{B/A} = R(B/A)$ is known as the marginal recombination distribution of B
with respect to loci in A. In particular we have that $R_{A \Delta C / B \Delta C}$ is the marginal
recombination distribution of $A \Delta C$ with respect to $B \Delta C$, and corresponds to
the condition of formation of A from (B, C).
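Definition (6') translates into a sum over the subsets of the complement of A. The Python sketch below assumes that a Geirenger distribution is stored as a dictionary from bitsets to probabilities; this storage choice and the function names are ours. Free recombination on six loci is used only as an illustrative distribution.

```python
from fractions import Fraction
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


def marginal(R, b, a, n):
    """R_{B/A}: sum of R(B ∪ C) over all C ⊆ A', for B ⊆ A, on loci 1..n."""
    b, a = frozenset(b), frozenset(a)
    if not b <= a:
        raise ValueError("B must be a subset of A")
    complement_of_a = frozenset(range(1, n + 1)) - a
    return sum(R.get(b | c, 0) for c in subsets(complement_of_a))


# Illustrative distribution: all 64 operators on six loci equally likely.
n = 6
R = {g: Fraction(1, 64) for g in subsets(range(1, n + 1))}

# Probability of the operator family F_{-01-01} of equation (5).
print(marginal(R, {3, 6}, {2, 3, 5, 6}, n))                             # 1/16
# The marginals over all B ⊆ A add up to one.
print(sum(marginal(R, b, {2, 3, 5, 6}, n) for b in subsets({2, 3, 5, 6})))  # 1
```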

With this notation we can return to the dynamics of gamete frequencies: to
find the frequency of gamete A in the offspring generation it is necessary to find
the probability of formation of a given zygote whose genotype is (B, C) such
that $B \cap C \subseteq A \subseteq B \cup C$, and then multiply this probability by the marginal
recombination distribution of $A \Delta C$ with respect to $B \Delta C$, which is noted as
$R_{A \Delta C / B \Delta C}$. In a panmictic population, the probability of formation of a zygote
with genotype (B, C) is, by definition, the product of the frequencies in the
population of B and C, $p(B)$ and $p(C)$ respectively. Therefore, the frequency of
gamete A in the offspring generation in a panmictic population, $p'(A)$, is

$p'(A) = \sum_{B \cap C \subseteq A \subseteq B \cup C} R_{A \Delta C / B \Delta C} \, p(B) p(C)$ (7)

Equation (7) is general and valid for any number of loci.
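Equation (7) can be iterated numerically for a small number of loci. The following Python sketch is a direct, unoptimized transcription, assuming gamete frequencies and the Geirenger distribution are stored as dictionaries keyed by bitsets; all names are ours. The two-locus example starts from a population made only of gametes 00 and 11 and uses free recombination.

```python
from fractions import Fraction
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


def marginal(R, b, a, universe):
    """R_{B/A}: sum of R(B ∪ C) over all C ⊆ A'."""
    return sum(R.get(b | c, 0) for c in subsets(universe - a))


def next_generation(p, R, n):
    """One generation of equation (7) for gamete frequencies p on loci 1..n."""
    universe = frozenset(range(1, n + 1))
    all_sets = subsets(universe)
    new_p = {}
    for a in all_sets:
        total = 0
        for b in all_sets:
            for c in all_sets:
                if (b & c) <= a <= (b | c):      # condition (4)
                    total += (marginal(R, a ^ c, b ^ c, universe)
                              * p.get(b, 0) * p.get(c, 0))
        new_p[a] = total
    return new_p


# Two loci, free recombination (c = 1/2): every operator has probability 1/4.
R = {g: Fraction(1, 4) for g in subsets({1, 2})}
p = {frozenset(): Fraction(1, 2), frozenset({1, 2}): Fraction(1, 2)}

p_next = next_generation(p, R, 2)
print(p_next[frozenset({1, 2})])   # 3/8
print(p_next[frozenset({1})])      # 1/8
print(sum(p_next.values()))        # 1
```

The triple loop visits every genotype for every gamete, so the cost grows as $8^n$; the sketch is meant for checking small cases, not for large n.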

Analogously to the marginal recombination distribution $R_{B/A}$ of the
recombinant operator B with respect to loci in A, the
marginal gamete probability of a gamete B relative to loci in A, $p(B/A)$,
where $B \subseteq A$, can be defined by

$p(B/A) = \sum_{C \subseteq A'} p(B \cup C)$ (8)

The most interesting property of $R_{B/A}$ is that

$\sum_{B \subseteq A} R_{B/A} = 1$ (9)

an equation that can be demonstrated by the substitution of $R_{B/A}$ and a
change of variable (see figure 5):

Figure 5. The conditions $B \subseteq A$ and $C \subseteq A'$ are equivalent to $D \subseteq N$ where
$D = B \cup C$.

$\sum_{B \subseteq A} R_{B/A} = \sum_{B \subseteq A} \sum_{C \subseteq A'} R(B \cup C) = \sum_{D \subseteq N} R(D) = 1$

where the pair of variables B and C is replaced by the single variable
D, which scans over all subsets of N, given that B scans over the subsets of A and C over those of $A'$, and that
$A \cup A' = N$. A similar equation for $p(B/A)$ holds:

$\sum_{B \subseteq A} p(B/A) = \sum_{B \subseteq A} \sum_{C \subseteq A'} p(B \cup C) = \sum_{D \subseteq N} p(D) = 1$ (10)

These equations justify the adjective "marginal" given to the recombination distribution
$R_{B/A}$ and to the probability $p(B/A)$.

5 CARDINALITY AND PARITY OF SETS 

In order to simplify the study of the dynamics of gamete frequencies, we need
some definitions: the cardinal of a set A, noted as $\#(A)$, is the number of
elements in A. The parity of A, $\S(A)$, is defined as $(-1)^{\#A}$, i.e., $\S(A)$ is 1
if $\#A$ is even and -1 if $\#A$ is odd (parentheses are omitted when no confusion is
expected).

The cardinal and parity have the following properties:

1. $\#(\emptyset) = 0$, $\S(\emptyset) = 1$ (11)

2. $\#(p(A)) = 2^{\#A}$ (12)

3. $\sum_{B \subseteq A} \S(B)$ equals 0 if $A \neq \emptyset$ and 1 if $A = \emptyset$ (13)

4. $\sum_{C \subseteq A} \S(C \cap B)$ equals 0 if $A \cap B \neq \emptyset$ and $2^{\#A}$ otherwise (14)

5. $\S(A \Delta B) = \S(A \cup B) = \S(A)\S(B)$ if $A \cap B = \emptyset$ (15)

6. $\S(A \cup B)\S(A \cup C) = \S(B \cup C)$ whenever A, B, C are pairwise disjoint (16)

7. $\S(A \cap B)\S(A \cap C) = \S(A \cap (B \Delta C))$ (17)
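Before turning to the proofs, properties (12) to (17) can be checked exhaustively on a small universal set. The Python sketch below does so by brute force; it is a numerical sanity check under the assumption of a small n, and the names `parity` and `check_parity_properties` are ours.

```python
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]


def parity(s):
    """Parity of a set: +1 if its cardinal is even, -1 if it is odd."""
    return -1 if len(s) % 2 else 1


def check_parity_properties(n):
    """Verify properties (12)-(17) for all subsets of {1, ..., n}."""
    sets = subsets(range(1, n + 1))
    for a in sets:
        assert len(subsets(a)) == 2 ** len(a)                          # (12)
        assert sum(parity(b) for b in subsets(a)) == (0 if a else 1)   # (13)
        for b in sets:
            expected = 0 if (a & b) else 2 ** len(a)
            assert sum(parity(c & b) for c in subsets(a)) == expected  # (14)
            if not (a & b):
                assert parity(a ^ b) == parity(a | b) == parity(a) * parity(b)  # (15)
            for c in sets:
                if not ((a & b) or (a & c) or (b & c)):
                    assert parity(a | b) * parity(a | c) == parity(b | c)        # (16)
                assert parity(a & b) * parity(a & c) == parity(a & (b ^ c))      # (17)
    return True


print(parity(frozenset()))          # 1, property (11)
print(check_parity_properties(4))   # True
```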

These properties can be proved as follows:

1. The number of elements in $\emptyset$ is zero, which is an even number; therefore,
the parity of $\emptyset$ is one.

2. The number of subsets of a given set A is $2^n$, where n is the number of
elements in A, because, first, there are 2 possibilities for each of n places
or bits in a binary number, i.e., $2^n$ binary numbers in all, and, second, it
is possible to construct a biunivocal relation between binary numbers of
n bits and subsets of a set of n elements. For example, let A = {1, ..., 8};
then to the subset {4, 5} corresponds the binary number 00011000, and to
the subset {2, 4, 6, 8} corresponds the binary number 01010101, and so on.

3. To demonstrate the third property, we begin with the following equation:
$0 = (-1 + 1)^n = \sum C(n, k)(-1)^k 1^{n-k} = \sum C(n, k)(-1)^k$, where the first
and second sums extend from 0 to n, and $C(n, k)$ is the binomial
coefficient. A change of variable can be made: $(-1)^k$ is the parity of a set B of k
elements, and $C(n, k)$ is the number of subsets of k elements chosen among
a set A of n elements; therefore $\sum C(n, k)(-1)^k = \sum C(n, k)\S B$,
where in the right side B is any subset of k elements and k runs from 0, the
cardinal of $\emptyset$, to n, the cardinal of A.

Now, $\sum C(n, k)\S B$ can be written simply as $\sum \S B$, where B scans
over all subsets of A, i.e., over $p(A)$, and from the beginning we knew
that this sum equals zero. In this demonstration, n can take any value
different from zero, i.e., $A \neq \emptyset$, for when n = 0 we have $(-1 + 1)^n = 0^0$, which is
an undefined form. This case can be easily evaluated directly: if $A = \emptyset$, then $\sum \S B$,
where the sum is extended over $B \subseteq A$, reduces to $\S(\emptyset) = 1$.

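4. To demonstrate (14), we need to calculate $\sum_{C \subseteq A} \S(C \cap B)$. So, let us
follow figure 6: with the definitions

$D = C - B$,

$E = C \cap B$,

$C$ can be partitioned as $C = D \cup E$. Therefore, the conditions $A \subseteq N$, $C \subseteq A$,
$B \subseteq N$ are equivalent to $D \subseteq A - B$, $E \subseteq A \cap B$.

Hence, $\sum_{C \subseteq A} \S(C \cap B) = \sum_{D \subseteq A - B} \sum_{E \subseteq A \cap B} \S E$.

If $A \cap B = \emptyset$, then $\sum_{E \subseteq A \cap B} \S E = 1$ by the previous property. In that case,
$\sum_{D \subseteq A - B} \sum_{E \subseteq A \cap B} \S E = \sum_{D \subseteq A} 1$, and because in $A$ there are $2^{\#A}$
subsets, this sum equals $2^{\#A}$; this includes the case in which $A = \emptyset$.

But, if $A \cap B \neq \emptyset$, then $\sum_{E \subseteq A \cap B} \S E = 0$ by the previous property. In that
case, $\sum_{D \subseteq A - B} \sum_{E \subseteq A \cap B} \S E = \sum_{D \subseteq A - B} 0 = 0$.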

5. Let us demonstrate now that if $A \cap B = \emptyset$ then $\S(A \Delta B) = \S(A \cup B) =
\S(A)\,\S(B)$. If $A$ and $B$ are disjoint then $A \Delta B = (A - B) \cup (B - A) = A \cup B$,
and $\#(A \cup B) = \#(A) + \#(B)$. Therefore,

$\S(A \cup B) = (-1)^{\#(A \cup B)} = (-1)^{\#(A)} (-1)^{\#(B)} = \S(A)\,\S(B)$.

From this follows the sixth property:

6. If $A$, $B$, $C$ are pairwise disjoint, then $\S(A \cup B)\,\S(A \cup C) = \S(A)\,\S(B)\,\S(A)\,\S(C)$,
but this equals $\S^2(A)\,\S(B \cup C) = \S(B \cup C)$, for $\S(A)$ equals either $+1$ or $-1$.

Figure 6. The conditions $A \subseteq N$, $C \subseteq A$, $B \subseteq N$ are modified by the
change of variables $D = C - B$, $E = C \cap B$, $C = D \cup E$ into $D \subseteq A - B$,
$E \subseteq A \cap B$.

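7. To demonstrate that $\S(A \cap B)\,\S(A \cap C) = \S(A \cap (B \Delta C))$, which is the 7th
property, let us decompose $A \cap B$ as

$A \cap B = ((A \cap B) - (A \cap B \cap C)) \cup (A \cap B \cap C) = ((A \cap B) - C) \cup (A \cap B \cap C)$

while

$A \cap C = ((A \cap C) - (A \cap B \cap C)) \cup (A \cap B \cap C) = ((A \cap C) - B) \cup (A \cap B \cap C)$.

As these unions are disjoint, then

$\S(A \cap B) = \S(((A \cap B) - C) \cup (A \cap B \cap C)) = \S((A \cap B) - C)\,\S(A \cap B \cap C)$

and

$\S(A \cap C) = \S(((A \cap C) - B) \cup (A \cap B \cap C)) = \S((A \cap C) - B)\,\S(A \cap B \cap C)$.

Therefore

$\S(A \cap B)\,\S(A \cap C) = \S((A \cap B) - C)\,\S(A \cap B \cap C)\,\S((A \cap C) - B)\,\S(A \cap B \cap C)$
$= \S((A \cap B) - C)\,\S((A \cap C) - B) = \S(((A \cap B) - C) \cup ((A \cap C) - B)) = \S(A \cap (B \Delta C))$,

which is an equation valid without restrictions (please draw a Venn
diagram with three circles to make everything clear).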

6 TRANSFORMED FREQUENCIES 

To simplify calculations on the dynamics of gamete frequencies, the concept of
transformed frequency of gamete $A$ is useful:

$t(A) = \sum_{B \subseteq N} \S(A \cap B)\, p(B)$ (18)

The substitution $X = A \cap B$, with $X \subseteq A$, and $Y = B - A$, with $Y \subseteq A'$,
so that $B = (A \cap B) \cup (B - A) = X \cup Y$, gives place to a change of variable:

$t(A) = \sum_{X \subseteq A} \sum_{Y \subseteq A'} \S(X)\, p(X \cup Y) = \sum_{X \subseteq A} \S(X) \sum_{Y \subseteq A'} p(X \cup Y)$

The domains of $X$ and $Y$ should be understood as follows: if $B$ runs over all subsets
of $N$, then $X = A \cap B$ runs over all subsets of $A$, while $Y = B - A$ can contain
any element not in $A$; therefore, $Y$ runs over the subsets of $A'$. Now, recalling definition
(8):

$p(B/A) = \sum_{C \subseteq A'} p(B \cup C)$

we have at last that:

$t(A) = \sum_{X \subseteq A} \S(X) \sum_{C \subseteq A'} p(X \cup C) = \sum_{X \subseteq A} \S(X)\, p(X/A)$ (19)

Here, $t(A)$ is the transformed frequency of gamete $A$ in matriarchal set notation,
$\S(X)$ is the parity of gamete $X$, and $p(X/A)$ is the marginal probability
of $X$ with respect to loci in $A$.
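As an illustration, definitions (18) and (19) can be evaluated side by side. This is a minimal sketch under stated assumptions: gametes are encoded as integer bitsets (bit 0 for locus 1, bit 1 for locus 2), frequencies are kept exact with `fractions.Fraction`, and the function names are hypothetical. The frequencies used are those of Example 2 of section 7.

```python
from fractions import Fraction


def parity(a: int) -> int:
    """+1 if the bitset a has an even number of elements, -1 otherwise."""
    return -1 if bin(a).count("1") % 2 else 1


def subsets(a: int):
    """Yield every subset of the bitset a, including 0 and a itself."""
    b = a
    while True:
        yield b
        if b == 0:
            return
        b = (b - 1) & a


def marginal(p, b, a, n):
    """p(B/A) of definition (8): sum of p(B u C) over C inside A' (B inside A assumed)."""
    comp = ((1 << n) - 1) & ~a
    return sum(p[b | c] for c in subsets(comp))


def t_direct(p, a, n):
    """Transformed frequency by (18): sum over all B of parity(A n B) * p(B)."""
    return sum(parity(a & b) * p[b] for b in range(1 << n))


def t_marginal(p, a, n):
    """Transformed frequency by (19): sum over X inside A of parity(X) * p(X/A)."""
    return sum(parity(x) * marginal(p, x, a, n) for x in subsets(a))


n = 2
# p[0] = p(empty), p[1] = p({1}), p[2] = p({2}), p[3] = p({1,2})
p = [Fraction(3, 10), Fraction(2, 10), Fraction(2, 10), Fraction(3, 10)]
t = [t_direct(p, a, n) for a in range(1 << n)]
```

Both routes give the same values, in agreement with the change of variable that leads from (18) to (19).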

What interest would the transformed frequencies have if it were not
possible to restore the ordinary frequencies from them? Therefore, the following
equalities are welcome:

$p(A) = 2^{-n} \sum_{B \subseteq N} \S(A \cap B)\, t(B)$ (20)

$p(B/A) = 2^{-\#A} \sum_{C \subseteq A} \S(B \cap C)\, t(C)$, given that $B \subseteq A$ (21)

Since

$p(A/N) = \sum_{D \subseteq N'} p(A \cup D) = p(A \cup \emptyset) = p(A)$ (22)

then (20) is a special case of (21), an inverse identity that will now be shown
beginning with its right side multiplied by $2^{\#A}$, and with a substitution of
$t(C)$ by its definition (18):

$\sum_{C \subseteq A} \S(B \cap C)\, t(C) = \sum_{C \subseteq A} \S(B \cap C) \sum_{D \subseteq N} \S(C \cap D)\, p(D)$
$= \sum_{D \subseteq N} p(D) \sum_{C \subseteq A} \S(B \cap C)\, \S(C \cap D)$
$= \sum_{D \subseteq N} p(D) \sum_{C \subseteq A} \S(C \cap (B \Delta D))$

where we have recalled (17):

$\S(B \cap C)\, \S(C \cap D) = \S(C \cap B)\, \S(C \cap D) = \S(C \cap (B \Delta D))$

The terms of the form $\sum_{C \subseteq A} \S(C \cap (B \Delta D))$ vanish according to (14) if
$(B \Delta D) \cap A \neq \emptyset$. So, we are left with those terms for which $(B \Delta D) \cap A = \emptyset$.
Let us keep in mind that by (21), $B \subseteq A$ and $C \subseteq A$.

When $(B \Delta D) \cap A = \emptyset$, then $B \Delta D \subseteq A'$ and the most general realization
of $D$ is $D = B \cup L$ for $L \subseteq A'$. Remembering that $B$ and $L$ are disjoint because
$B \subseteq A$ and $L \subseteq A'$, we verify that

$(B \Delta D) \cap A = (B \Delta (B \cup L)) \cap A = (B \Delta B \Delta L) \cap A = (\emptyset \Delta L) \cap A = L \cap A = \emptyset$.

Now, according to (14), $\sum \S(C \cap (B \Delta D))$, summed over $C \subseteq A$, equals $2^{\#A}$
whenever $(B \Delta D) \cap A = \emptyset$. Using this, replacing $D$ by $B \cup L$, and recalling
definition (8) we have:

$\sum_{D \subseteq N} p(D) \sum_{C \subseteq A} \S(C \cap (B \Delta D)) = \sum_{L \subseteq A'} p(B \cup L)\, 2^{\#A} = 2^{\#A}\, p(B/A)$

which is the left side of (21) multiplied by $2^{\#A}$, as required.
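The inverse identity (21) lends itself to a numerical round trip: transform a frequency vector with (18), then recover every marginal frequency with (21) and compare it with definition (8). The sketch below assumes the integer bitset encoding of sets; the three-locus frequencies are an arbitrary illustration, not data from the text, and all function names are hypothetical.

```python
from fractions import Fraction


def card(a: int) -> int:
    return bin(a).count("1")


def parity(a: int) -> int:
    return -1 if card(a) % 2 else 1


def subsets(a: int):
    b = a
    while True:
        yield b
        if b == 0:
            return
        b = (b - 1) & a


def marginal(p, b, a, n):
    """Definition (8): p(B/A) as a sum of p(B u C) over C inside A'."""
    comp = ((1 << n) - 1) & ~a
    return sum(p[b | c] for c in subsets(comp))


def transform(p, n):
    """Definition (18), evaluated for every subset A."""
    return [sum(parity(a & b) * p[b] for b in range(1 << n)) for a in range(1 << n)]


def inverse_marginal(t, b, a):
    """Identity (21): p(B/A) = 2^(-#A) * sum over C inside A of parity(B n C) * t(C)."""
    return Fraction(1, 2 ** card(a)) * sum(parity(b & c) * t[c] for c in subsets(a))


n = 3
weights = [1, 2, 3, 4, 5, 6, 7, 8]            # arbitrary illustrative gamete weights
p = [Fraction(w, sum(weights)) for w in weights]
t = transform(p, n)
recovered = [inverse_marginal(t, b, (1 << n) - 1) for b in range(1 << n)]  # (20)
```

Taking $A = N$ reproduces (20), so `recovered` should coincide with `p`.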

The simplest calculation of a transformed frequency according to (19)
and (8) is that of

$t(\emptyset) = \sum_{X \subseteq \emptyset} \S(X)\, p(X/\emptyset) = \S(\emptyset)\, p(\emptyset/\emptyset) = \sum_{B \subseteq \emptyset'} p(\emptyset \cup B) = \sum_{B \subseteq N} p(B) = 1$ (23)

This means that $t(\emptyset)$ does not evolve over time.

Transformed frequencies of one locus are very important. We note by $t_a$ the
transformed frequency $t(\{a\})$, which calculated according to (19) becomes:

$t_a = t(\{a\}) = \S\emptyset\; p(\emptyset/\{a\}) + \S(\{a\})\, p(\{a\}/\{a\}) = p(\emptyset/\{a\}) - p(\{a\}/\{a\})$ (24)

This $t_a$ restores the marginal frequencies according to (21) as

$p(\emptyset/\{a\}) = 2^{-1}(\S\emptyset\; t(\emptyset) + \S\emptyset\; t(\{a\})) = (1/2)(1 + t_a)$ (25)

$p(\{a\}/\{a\}) = 2^{-1}(\S\emptyset\; t(\emptyset) + \S(\{a\})\, t(\{a\})) = (1/2)(1 - t_a)$ (25')

Equations (25) and (25') will be referred to as (25).

7 EVOLUTION OF TRANSFORMED GAMETE 
FREQUENCIES 

Equation (7),

$p'(A) = \sum_{B \cap C \subseteq A \subseteq B \cup C} R_{A \Delta C / B \Delta C}\; p(B)\, p(C)$

which describes the evolution of gamete frequencies, can be transformed into
a more tractable equation:

$t'(A) = \sum_{B \subseteq A} R_{B/A}\; t(B)\, t(A - B)$ (26)

where $t'(A)$ is the transformed frequency of the gamete $A$ in the offspring
generation, and $t(B)$, $t(A - B)$ are respectively the transformed frequencies of
$B$ and $A - B$ in the given generation as defined by (18) or (19); $R_{B/A}$ is the
marginal recombination distribution of $B$ relative to loci in $A$, related to the
recombination distribution $R$ by (6).

The rest of this section will be devoted to the proof of (26), whose demonstration
is required not for any transformed frequency $t(A)$ but for $t(N)$ only.
This results from the fact that in (19) the marginal gamete frequencies $p(B/A)$
are invoked for the case in which $B \subseteq A$; therefore $A$ plays in (26) the role of
the universal set $N$.

Hence, we would like to demonstrate that

$t'(N) = \sum_{B \subseteq N} R_B\; t(B)\, t(B')$ (26')

because $R_{B/N} = R_B$ and $t(N - B)$ is noted as $t(B')$.

By the definition of transformed frequencies (18) we have that

$t'(N) = \sum_{A \subseteq N} \S(A \cap N)\, p'(A) = \sum_{A \subseteq N} \S(A)\, p'(A)$

Recalling (7) and adopting Geiringer's notation, we have:

$t'(N) = \sum_{A \subseteq N} \S(A) \sum_{B \cap C \subseteq A \subseteq B \cup C} R(A \Delta C / B \Delta C)\, p(B)\, p(C)$

To transform this equation into (26), the terms $p(B) = p(B/N)$ and $p(C) =
p(C/N)$ need to disappear to give their places to transformed frequencies, so
they can be replaced by their values according to the inverse relations (21):

$t'(N) = \sum_{A \subseteq N} \S(A) \sum_{B \cap C \subseteq A \subseteq B \cup C} R(A \Delta C / B \Delta C) \times \big(2^{-n} \sum_{X \subseteq N} \S(B \cap X)\, t(X)\big) \big(2^{-n} \sum_{Y \subseteq N} \S(C \cap Y)\, t(Y)\big)$

$t'(N) = 2^{-2n} \sum_{A \subseteq N} \sum_{B \cap C \subseteq A \subseteq B \cup C} \sum_{X \subseteq N} \sum_{Y \subseteq N} \S(A)\, R(A \Delta C / B \Delta C)\, \S(B \cap X)\, \S(C \cap Y)\, t(X)\, t(Y)$

Since $A$, $X$ and $Y$ are independent variables, it is possible to reorder this
sum to get:

$t'(N) = 2^{-2n} \sum_{X \subseteq N} \sum_{Y \subseteq N} W(X, Y)\, t(X)\, t(Y)$ (27)

where

$W(X, Y) = \sum_{A \subseteq N} \sum_{B \cap C \subseteq A \subseteq B \cup C} \S(A)\, \S(B \cap X)\, \S(C \cap Y)\, R(A \Delta C / B \Delta C)$ (28)

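To evaluate $W(X, Y)$, we partition $A$, $B$ and $C$ into five disjoint components
$E$, $F$, $G$, $H$, $K$ such that (see figure 7):

Figure 7. To calculate $W(X, Y)$ in (28), we need to partition the sets $B$, $A$, $C$
into five disjoint subsets as this figure shows; the figure also marks $T = G \cup K$ and $M = F \cup E$.

$A = E \cup G \cup H$, $B = F \cup G \cup H$, $C = H \cup E \cup K$.

Therefore,

$A \Delta C = (E \cup G \cup H) \Delta (H \cup E \cup K)$
$= ((E \cup G \cup H) - (H \cup E \cup K)) \cup ((H \cup E \cup K) - (E \cup G \cup H)) = G \cup K$,

and

$B \Delta C = ((F \cup G \cup H) - (H \cup E \cup K)) \cup ((H \cup E \cup K) - (F \cup G \cup H))$
$= F \cup G \cup E \cup K = G \cup K \cup F \cup E$.

If we note $T = G \cup K = G \Delta K$ and $M = F \cup E = F \Delta E$, then
$A \Delta C = T$ and $B \Delta C = T \cup M$. Applying definition (6'), we get:

$R(A \Delta C / B \Delta C) = \sum_{I \subseteq (B \Delta C)'} R((A \Delta C) \cup I) = \sum_{I \subseteq (T \cup M)'} R(T \cup I)$ (29)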

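On the other hand:

$\S(A)\, \S(B \cap X)\, \S(C \cap Y)$
$= \S(E \cup G \cup H)\, \S((F \cup G \cup H) \cap X)\, \S((H \cup E \cup K) \cap Y)$
$= \S(E \Delta G \Delta H)\, \S((\emptyset \Delta F \Delta G \Delta H) \cap X)\, \S((\emptyset \Delta H \Delta E \Delta K) \cap Y)$
$= \S(E \Delta G \Delta H)\, \S((E \Delta E \Delta F \Delta G \Delta H) \cap X)\, \S((G \Delta G \Delta H \Delta E \Delta K) \cap Y)$
$= \S(E \Delta G \Delta H)\, \S((E \Delta F \Delta E \Delta G \Delta H) \cap X)\, \S((G \Delta K \Delta E \Delta G \Delta H) \cap Y)$
$= \S(E \Delta G \Delta H)\, \S((M \Delta E \Delta G \Delta H) \cap X)\, \S((T \Delta E \Delta G \Delta H) \cap Y)$
$= \S(E \Delta G \Delta H)\, \S(M \cap X)\, \S((E \Delta G \Delta H) \cap X)\, \S(T \cap Y)\, \S((E \Delta G \Delta H) \cap Y)$
$= \S(M \cap X)\, \S(T \cap Y)\, \S(E \Delta G \Delta H)\, \S((E \Delta G \Delta H) \cap X)\, \S((E \Delta G \Delta H) \cap Y)$
$= \S(M \cap X)\, \S(T \cap Y)\, \S((E \Delta G \Delta H) \cap N)\, \S((E \Delta G \Delta H) \cap X)\, \S((E \Delta G \Delta H) \cap Y)$
$= \S(M \cap X)\, \S(T \cap Y)\, \S((E \Delta G \Delta H) \cap (N \Delta X \Delta Y))$
$= \S(M \cap X)\, \S(T \cap Y)\, \S((E \Delta G \Delta H) \cap (X \Delta Y)')$

In short:

$\S(A)\, \S(B \cap X)\, \S(C \cap Y) = \S(M \cap X)\, \S(T \cap Y)\, \S((E \Delta G \Delta H) \cap (X \Delta Y)')$ (30)

In these steps we have used some properties of the set operations $\cup$, $\Delta$, $\cap$ and of the parity $\S$:
the union of disjoint sets coincides with their symmetric difference; the operation $\Delta$
has a neutral element $\emptyset$ and the inverse of any set is the same set itself; $\Delta$ is
associative and commutative, i.e., neither the order nor the parentheses matter;
$\cap$ is distributive with respect to $\cup$ or $\Delta$; $N$ is the neutral element of $\cap$ (see
section 3). We also used property (17) of parity.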

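Now, we can substitute (29) and (30) into (28) to get

$W(X, Y) = \sum \S(M \cap X)\, \S(T \cap Y)\, \S((E \Delta G \Delta H) \cap (X \Delta Y)')\, R(T \cup I)$

To find the domains of the new variables, let us list all the conditions we have:
$X \subseteq N$, $Y \subseteq N$ from (27) and $A \subseteq N$, $B \cap C \subseteq A \subseteq B \cup C$ from (28),
$I \subseteq (T \cup M)'$ from (29), where $A = E \cup G \cup H$, $B = F \cup G \cup H$, $C = H \cup E \cup K$,
$M = F \cup E$ and $T = G \cup K$ as defined just after (28). The domains of the new
variables can be defined from these conditions in multiple ways, but we adopt
one that will allow us great simplification: $T \subseteq N$ ($T$ is taken as an independent
variable, which can run over all subsets of $N$), $M \subseteq T'$ (for $M \cap T = \emptyset$),
$I \subseteq (T \cup M)'$, $E \subseteq M$ (since $M = F \cup E$), $G \subseteq T$ (since $T = G \cup K$),
$H \subseteq (T \cup M)'$ (for $H$ has an empty intersection with $G \cup K \cup F \cup E = T \cup M$).
Reordering:

$W(X, Y) = \sum_{C1} \S(M \cap X)\, \S(T \cap Y)\, R(T \cup I) \sum_{C2} \S((E \Delta G \Delta H) \cap (X \Delta Y)')$ (31)

where C1 stands for $T \subseteq N$, $M \subseteq T'$, $I \subseteq (T \cup M)'$ and C2 stands for
$E \subseteq M$, $G \subseteq T$, $H \subseteq (T \cup M)'$. As

$(E \Delta G \Delta H) \cap (X \Delta Y)' = (E \cap (X \Delta Y)') \Delta (G \cap (X \Delta Y)') \Delta (H \cap (X \Delta Y)')$,

and each of these three terms is disjoint from the others, then by (15)

$\S((E \Delta G \Delta H) \cap (X \Delta Y)') = \S(E \cap (X \Delta Y)')\, \S(G \cap (X \Delta Y)')\, \S(H \cap (X \Delta Y)')$.

Therefore, using this factorization, the second sum in (31) can be rearranged as

$\sum_{C2} \S((E \Delta G \Delta H) \cap (X \Delta Y)')$
$= \sum_{E \subseteq M} \sum_{G \subseteq T} \sum_{H \subseteq (T \cup M)'} \S(E \cap (X \Delta Y)')\, \S(G \cap (X \Delta Y)')\, \S(H \cap (X \Delta Y)')$
$= \sum_{E \subseteq M} \S(E \cap (X \Delta Y)') \sum_{G \subseteq T} \S(G \cap (X \Delta Y)') \sum_{H \subseteq (T \cup M)'} \S(H \cap (X \Delta Y)')$

Applying (14) to each of these sums, we have that the terms that do not
vanish fulfill

$M \cap (X \Delta Y)' = \emptyset$;
$T \cap (X \Delta Y)' = \emptyset$;
$(T \cup M)' \cap (X \Delta Y)' = (T \cup M \cup (X \Delta Y))' = \emptyset$.

These three equalities hold if $X \Delta Y = N$, i.e., if $Y = X'$. Replacing
$E \Delta G \Delta H$ by $A$, we get:

$\sum_{C2} \S((E \Delta G \Delta H) \cap (X \Delta Y)') = \sum_{A \subseteq N} \S(A \cap (X \Delta Y)') = \sum_{A \subseteq N} \S(A \cap \emptyset)$
$= \sum_{A \subseteq N} \S(\emptyset) = \sum_{A \subseteq N} 1 = 2^n$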

Let us replace this expression in (31),

$W(X, Y) = \sum_{C1} \S(M \cap X)\, \S(T \cap Y)\, R(T \cup I) \sum_{C2} \S((E \Delta G \Delta H) \cap (X \Delta Y)')$

and, replacing also $Y$ by $X'$, we have that

$W(X, X') = 2^n \sum_{C1} \S(M \cap X)\, \S(T \cap X')\, R(T \cup I)$ (32)

where C1 stands for $T \subseteq N$, $M \subseteq T'$, $I \subseteq (T \cup M)'$. Now let $L = T \cup I$;
then (32) becomes

$W(X, X') = 2^n \sum_{C3} \S(M \cap X)\, \S(T \cap X')\, R(L)$

where C3 determines the domains of the variables:

$L \subseteq N$ (the independent variable),

$T \subseteq L$ (for $L = T \cup I$),

$M \subseteq L'$ (since, from $I \subseteq (T \cup M)'$ it ensues that $M \cap I = \emptyset$, or $M \subseteq I'$, and,
by construction, $M \cap T = \emptyset$, or $M \subseteq T'$. Therefore, $M \subseteq T' \cap I' = (T \cup I)' = L'$).

Then

$W(X, X') = 2^n \sum_{L \subseteq N} R(L) \sum_{T \subseteq L} \S(T \cap X') \sum_{M \subseteq L'} \S(M \cap X)$

The sum over $M$ implies that $W(X, X')$ is nonzero only if $L' \cap X = (L \cup X')' = \emptyset$,
that is, $L \cup X' = N$, while the sum over $T$ implies that the only important terms
are those determined by $L \cap X' = \emptyset$. Together, these conditions are met if $L = X$. Then
$\sum_{L \subseteq N} R(L)$ reduces to $R(X)$ and by (14)

$W(X, X') = 2^n R(X) \sum_{T \subseteq X} \S(T \cap X') \sum_{M \subseteq X'} \S(M \cap X)$
$= 2^n R(X)\, 2^{\#X}\, 2^{\#X'} = 2^n R(X)\, 2^{\#X + \#X'} = 2^n R(X)\, 2^n = 2^{2n} R(X)$

Now, coming back to (27) we have, at last, the required identity (26'):

$t'(N) = 2^{-2n} \sum_{X \subseteq N} \sum_{X'} 2^{2n} R(X)\, t(X)\, t(X') = \sum_{X \subseteq N} R(X)\, t(X)\, t(X')$

where one sum is omitted because $X'$ is completely determined by $X$ and
can take just one value.
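The identity just proved can be cross-checked numerically: evolve the gamete frequencies with (7), transform the result with (18), and compare with the recurrence (26) applied to the transformed parental frequencies. The sketch below is only an illustration under stated assumptions: sets are integer bitsets, the three-locus frequencies and recombination distribution are arbitrary made-up values, and the function names are hypothetical.

```python
from fractions import Fraction


def parity(a: int) -> int:
    return -1 if bin(a).count("1") % 2 else 1


def subsets(a: int):
    b = a
    while True:
        yield b
        if b == 0:
            return
        b = (b - 1) & a


def marg(f, b, a, n):
    """Marginal of a distribution f: sum of f(B u C) over C inside A', as in (6') and (8)."""
    comp = ((1 << n) - 1) & ~a
    return sum(f[b | c] for c in subsets(comp))


def next_p(p, R, n):
    """Equation (7). For each pair B, C the offspring A satisfies B n C <= A <= B u C,
    which is the same as saying that S = A delta C is a subset of B delta C."""
    size = 1 << n
    new = [Fraction(0)] * size
    for b in range(size):
        for c in range(size):
            diff = b ^ c
            for s in subsets(diff):
                new[c ^ s] += marg(R, s, diff, n) * p[b] * p[c]
    return new


def transform(p, n):
    """Definition (18) for every subset A."""
    return [sum(parity(a & b) * p[b] for b in range(1 << n)) for a in range(1 << n)]


def next_t(t, R, n):
    """Recurrence (26): t'(A) = sum over B inside A of R(B/A) t(B) t(A - B)."""
    return [sum(marg(R, b, a, n) * t[b] * t[a ^ b] for b in subsets(a))
            for a in range(1 << n)]


n = 3
p = [Fraction(w, 36) for w in (1, 2, 3, 4, 5, 6, 7, 8)]   # illustrative frequencies
R = [Fraction(w, 16) for w in (4, 1, 2, 1, 1, 2, 1, 4)]   # illustrative recombination
lhs = transform(next_p(p, R, n), n)   # evolve with (7), then transform
rhs = next_t(transform(p, n), R, n)   # transform, then evolve with (26)
```

For a subset `b` of `a`, the set difference $A - B$ is computed as `a ^ b`. Exact fractions make the comparison of `lhs` and `rhs` an equality test rather than a tolerance check.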
Example 1: One locus. According to (26) the evolution of the transformed
gamete frequency of one locus, $t_a$, is given by

$t'_a = \sum_{B \subseteq \{a\}} R(B/\{a\})\, t(B)\, t(\{a\} - B)$
$= R(\emptyset/\{a\})\, t(\emptyset)\, t(\{a\}) + R(\{a\}/\{a\})\, t(\{a\})\, t(\emptyset)$
$= (R(\emptyset/\{a\}) + R(\{a\}/\{a\}))\, t(\emptyset)\, t(\{a\}) = t(\{a\}) = t_a$ (33)

because, according to (9), $R(\emptyset/\{a\}) + R(\{a\}/\{a\}) = 1$ and $t(\emptyset) = 1$ from (23)
(this implies that $t'(\emptyset) = t(\emptyset)$). On the other hand, from the inverse transform
(21) we have that

$p'(\{a\}/\{a\}) = 2^{-\#\{a\}}(\S(\{a\} \cap \emptyset)\, t'(\emptyset) + \S(\{a\} \cap \{a\})\, t'(\{a\}))$
$= 2^{-\#\{a\}}(\S(\emptyset)\, t'(\emptyset) + \S(\{a\})\, t'(\{a\}))$
$= 2^{-1}(1 - t'_a) = 2^{-1}(1 - t_a) = p(\{a\}/\{a\})$

Or

$p'(\{a\}/\{a\}) = p(\{a\}/\{a\})$ (34)

where in the last equality we have recalled identity (25); similarly

$p'(\emptyset/\{a\}) = 2^{-\#\{a\}}(\S(\emptyset \cap \emptyset)\, t'(\emptyset) + \S(\emptyset \cap \{a\})\, t'(\{a\}))$
$= 2^{-1}(1 + t'_a) = 2^{-1}(1 + t_a) = p(\emptyset/\{a\})$

Or

$p'(\emptyset/\{a\}) = p(\emptyset/\{a\})$ (34')

These equations say that the transformed frequency of one single locus and 
its marginal gamete frequencies do not evolve in any way. This is simply the 
Hardy Weinberg law of equilibrium which is (34) with n = 1, under the assump- 
tion that zygotes have just one locus. 

Example 2: One simple calculation. Let $N = \{1, 2\}$, $p(\emptyset) = 3/10$, $p(\{1\}) =
p(\{2\}) = 2/10$, $p(N) = 3/10$. We use (8) to calculate the diverse values of

$p(B/A) = \sum_{C \subseteq A'} p(B \cup C)$

$p(\emptyset/\emptyset) = p(\emptyset) + p(\{1\}) + p(\{2\}) + p(\{1,2\}) = 1$

$p(\emptyset/\{1\}) = p(\emptyset \cup \emptyset) + p(\emptyset \cup \{2\}) = 3/10 + 2/10 = 5/10 = 1/2$

$p(\{1\}/\{1\}) = p(\{1\} \cup \emptyset) + p(\{1\} \cup \{2\}) = 2/10 + 3/10 = 5/10 = 1/2$

Note that $p(\emptyset/\{1\}) + p(\{1\}/\{1\}) = 1$.

$p(\emptyset/\{2\}) = p(\emptyset \cup \emptyset) + p(\emptyset \cup \{1\}) = 3/10 + 2/10 = 5/10 = 1/2$.
$p(\{2\}/\{2\}) = p(\{2\} \cup \emptyset) + p(\{2\} \cup \{1\}) = 2/10 + 3/10 = 5/10 = 1/2$.
$p(C/N) = p(C)$ for any $C$ according to (22).

The transformed frequencies are calculated according to (19):

$t(A) = \sum_{X \subseteq A} \S(X) \sum_{C \subseteq A'} p(X \cup C) = \sum_{X \subseteq A} \S(X)\, p(X/A)$

$t(\emptyset) = 1$, as stated in (23).

$t(\{1\}) = \S(\emptyset)\, p(\emptyset/\{1\}) + \S(\{1\})\, p(\{1\}/\{1\}) = 1/2 - 1/2 = 0$.

$t(\{2\}) = \S(\emptyset)\, p(\emptyset/\{2\}) + \S(\{2\})\, p(\{2\}/\{2\}) = 1/2 - 1/2 = 0$.

$t(\{1,2\}) = \S(\emptyset)\, p(\emptyset/\{1,2\}) + \S(\{1\})\, p(\{1\}/\{1,2\}) + \S(\{2\})\, p(\{2\}/\{1,2\}) + \S(\{1,2\})\, p(\{1,2\}/\{1,2\})$
$= 3/10 - 2/10 - 2/10 + 3/10 = 6/10 - 4/10 = 1/5$.

Observe that $t(\emptyset) + t(\{1\}) + t(\{2\}) + t(\{1,2\}) \neq 1$. This means that, until
now, the only way to generate numbers that can serve as transformed frequencies,
i.e., numbers from which the inverse transformation (20) restores positive
numbers summing up to one, is to transform ordinary frequencies according to
the conventional definition of transformed frequency (18).

Let us verify for some cases the inverse transform (21): given that $B \subseteq A$,
then

$p(B/A) = 2^{-\#A} \sum_{C \subseteq A} \S(B \cap C)\, t(C)$.

$p(\emptyset/\emptyset) = 2^{-\#\emptyset}\, \S\emptyset\; t(\emptyset) = 2^{-0} = 1$.

$p(\emptyset/\{1\}) = 2^{-\#\{1\}}(\S\emptyset\; t(\emptyset) + \S\emptyset\; t(\{1\})) = 1/2(1 + 0) = 1/2$.

$p(\emptyset/\{1,2\}) = 2^{-\#\{1,2\}}(\S\emptyset\; t(\emptyset) + \S\emptyset\; t(\{1\}) + \S\emptyset\; t(\{2\}) + \S\emptyset\; t(\{1,2\}))$
$= 1/4(1 + 0 + 0 + 1/5) = 1/4(6/5) = 6/20 = 3/10$.

$p(\{1\}/\{1,2\}) = 2^{-\#\{1,2\}}(\S(\{1\} \cap \emptyset)\, t(\emptyset) + \S(\{1\} \cap \{1\})\, t(\{1\}) + \S(\{1\} \cap \{2\})\, t(\{2\}) + \S(\{1\} \cap \{1,2\})\, t(\{1,2\}))$
$= 1/4(1 - 0 + 0 - 1/5) = 1/5$.

$p(\{1,2\}/\{1,2\}) = 2^{-\#\{1,2\}}(\S(\{1,2\} \cap \emptyset)\, t(\emptyset) + \S(\{1,2\} \cap \{1\})\, t(\{1\}) + \S(\{1,2\} \cap \{2\})\, t(\{2\}) + \S(\{1,2\} \cap \{1,2\})\, t(\{1,2\}))$
$= 1/4(1 - 0 - 0 + 1/5) = 1/4(1 + 1/5) = 6/20 = 3/10$.

Let $R(\emptyset) = R(\{1,2\}) = 1/3$ and $R(\{1\}) = R(\{2\}) = 1/6$; the probability
of no recombination is $R(\emptyset) + R(\{1,2\}) = 2/3$. The marginal recombination
distribution is defined for $B \subseteq A$ by (6'):

$R(B/A) = \sum_{C \subseteq A'} R(B \cup C)$

$R(\emptyset/\emptyset) = R(\emptyset) + R(\{1\}) + R(\{2\}) + R(\{1,2\}) = 1$
$R(\emptyset/\{1\}) = R(\emptyset \cup \emptyset) + R(\emptyset \cup \{2\}) = 1/3 + 1/6 = 1/2$;
$R(\{1\}/\{1\}) = R(\{1\} \cup \emptyset) + R(\{1\} \cup \{2\}) = 1/6 + 1/3 = 1/2$;

We have that

$R(\emptyset/\{1\}) + R(\{1\}/\{1\}) = 1$.

$R(\emptyset/\{2\}) = R(\emptyset \cup \emptyset) + R(\emptyset \cup \{1\}) = 1/3 + 1/6 = 1/2$.
$R(\{2\}/\{2\}) = R(\{2\} \cup \emptyset) + R(\{2\} \cup \{1\}) = 1/6 + 1/3 = 1/2$.
$R(C/N) = R(C)$ for any $C$, in the same way as in (22).

The transformed frequencies in the offspring generation are given by

$t'(A) = \sum_{B \subseteq A} R(B/A)\, t(B)\, t(A - B)$

$t'(\emptyset) = 1$

$t'(\{1\}) = R(\emptyset/\{1\})\, t(\emptyset)\, t(\{1\} - \emptyset) + R(\{1\}/\{1\})\, t(\{1\})\, t(\{1\} - \{1\})$
$= (1/2)(0) + (1/2)(0) = 0$;

$t'(\{2\}) = R(\emptyset/\{2\})\, t(\emptyset)\, t(\{2\} - \emptyset) + R(\{2\}/\{2\})\, t(\{2\})\, t(\{2\} - \{2\}) = 0$;

$t'(\{1,2\}) = R(\emptyset/\{1,2\})\, t(\emptyset)\, t(\{1,2\} - \emptyset) + R(\{1\}/\{1,2\})\, t(\{1\})\, t(\{1,2\} - \{1\})$
$+ R(\{2\}/\{1,2\})\, t(\{2\})\, t(\{1,2\} - \{2\}) + R(\{1,2\}/\{1,2\})\, t(\{1,2\})\, t(\{1,2\} - \{1,2\})$
$= (1/3)(1/5) + (1/6)(0) + (1/6)(0) + (1/3)(1/5) = 1/15 + 1/15 = 2/15$



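The diverse $p'$ of the offspring generation can be calculated from the $t'$.
Applying equation (21), we have

$p'(B/A) = 2^{-\#A} \sum_{C \subseteq A} \S(B \cap C)\, t'(C)$, given that $B \subseteq A$

$p'(\emptyset/\emptyset) = 2^{-\#\emptyset}\, \S\emptyset\; t'(\emptyset) = 2^{-0} = 1$.

$p'(\emptyset/\{1\}) = 2^{-\#\{1\}}(\S\emptyset\; t'(\emptyset) + \S\emptyset\; t'(\{1\})) = 1/2(1 + 0) = 1/2$.
$p'(\{1\}/\{1\}) = 2^{-\#\{1\}}(\S\emptyset\; t'(\emptyset) + \S\{1\}\, t'(\{1\})) = 1/2(1 - 0) = 1/2$.

$p'(\emptyset/\{2\}) = 2^{-\#\{2\}}(\S\emptyset\; t'(\emptyset) + \S\emptyset\; t'(\{2\})) = 1/2(1 + 0) = 1/2$.
$p'(\{2\}/\{2\}) = 2^{-\#\{2\}}(\S\emptyset\; t'(\emptyset) + \S\{2\}\, t'(\{2\})) = 1/2(1 - 0) = 1/2$.

$p'(\emptyset/\{1,2\}) = 2^{-\#\{1,2\}}(\S\emptyset\; t'(\emptyset) + \S\emptyset\; t'(\{1\}) + \S\emptyset\; t'(\{2\}) + \S\emptyset\; t'(\{1,2\}))$
$= 1/4(1 + 0 + 0 + 2/15) = (1/4)(17/15) = 17/60$.

$p'(\{1\}/\{1,2\}) = 2^{-\#\{1,2\}}(\S(\{1\} \cap \emptyset)\, t'(\emptyset) + \S(\{1\} \cap \{1\})\, t'(\{1\}) + \S(\{1\} \cap \{2\})\, t'(\{2\}) + \S(\{1\} \cap \{1,2\})\, t'(\{1,2\}))$
$= 1/4(1 - 0 + 0 - 2/15) = (1/4)(13/15) = 13/60$.

$p'(\{2\}/\{1,2\}) = 2^{-\#\{1,2\}}(\S(\{2\} \cap \emptyset)\, t'(\emptyset) + \S(\{2\} \cap \{1\})\, t'(\{1\}) + \S(\{2\} \cap \{2\})\, t'(\{2\}) + \S(\{2\} \cap \{1,2\})\, t'(\{1,2\}))$
$= 1/4(1 + 0 - 0 - 2/15) = (1/4)(13/15) = 13/60$.

$p'(\{1,2\}/\{1,2\}) = 2^{-\#\{1,2\}}(\S(\{1,2\} \cap \emptyset)\, t'(\emptyset) + \S(\{1,2\} \cap \{1\})\, t'(\{1\}) + \S(\{1,2\} \cap \{2\})\, t'(\{2\}) + \S(\{1,2\} \cap \{1,2\})\, t'(\{1,2\}))$
$= 1/4(1 - 0 - 0 + 2/15) = 1/4(1 + 2/15) = (1/4)(17/15) = 17/60$;

Please verify that the marginal frequencies add up to ONE.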
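The arithmetic of Example 2 can be reproduced mechanically. The sketch below assumes the bitset encoding used informally in this work (here bit 0 stands for locus 1 and bit 1 for locus 2) and exact rational arithmetic; the function names are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction as F


def parity(a: int) -> int:
    return -1 if bin(a).count("1") % 2 else 1


def subsets(a: int):
    b = a
    while True:
        yield b
        if b == 0:
            return
        b = (b - 1) & a


def marg(f, b, a, n):
    """Marginal distribution (6') / (8): sum of f(B u C) over C inside A'."""
    comp = ((1 << n) - 1) & ~a
    return sum(f[b | c] for c in subsets(comp))


def transform(p, n):
    """Definition (18)."""
    return [sum(parity(a & b) * p[b] for b in range(1 << n)) for a in range(1 << n)]


def inverse(t, n):
    """Identity (20): ordinary frequencies restored from transformed ones."""
    return [F(1, 2 ** n) * sum(parity(a & b) * t[b] for b in range(1 << n))
            for a in range(1 << n)]


def next_t(t, R, n):
    """Recurrence (26)."""
    return [sum(marg(R, b, a, n) * t[b] * t[a ^ b] for b in subsets(a))
            for a in range(1 << n)]


n = 2
# index 0 = empty set, 1 = {1}, 2 = {2}, 3 = {1,2}
p = [F(3, 10), F(2, 10), F(2, 10), F(3, 10)]
R = [F(1, 3), F(1, 6), F(1, 6), F(1, 3)]

t = transform(p, n)           # parental transformed frequencies
t_next = next_t(t, R, n)      # offspring transformed frequencies
p_next = inverse(t_next, n)   # offspring gamete frequencies
```

With these inputs `t` is `[1, 0, 0, 1/5]`, `t_next` is `[1, 0, 0, 2/15]` and `p_next` is `[17/60, 13/60, 13/60, 17/60]`, the values obtained by hand above.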



8 FIXED POINTS OF GAMETE DYNAMICS 

We already noted, in (33) and (34), that the transformed frequencies and the
marginal frequencies relative to one locus remain constant under reproduction
with random mating. However, gamete frequencies for more than one locus
evolve according to (26). Would this evolution end in an equilibrium state,
in a fixed point, or would the frequencies keep moving further and further? Let us begin to
answer this question by proving that for each initial condition there is a given
fixed point, which differs for different initial conditions, and so equilibriums are
in general unstable.



Using the ordinary symbol $\in$, as in $a \in A$ meaning that $a$ belongs to $A$, we
define

$e_A = \prod_{a \in A} t_a$ (35)

where the letter $e$ is the first letter of equilibrium, for we are going to prove
that once the system arrives at $e_A$, for each $A$, it remains invariable:

$e'_A = e_A$ (36)

where $e'_A$ stands for the value of $e_A$ in the offspring generation. To demonstrate
(36) let us suppose that the transformed frequencies $t(A)$ are equal to $e_A$ for
each $A$. Let us substitute $e'_A$ by its value from (26):

$e'_A = \sum_{B \subseteq A} R(B/A)\, e_B\, e_{A-B} = \sum_{B \subseteq A} R(B/A) \prod_{b \in B} t_b \prod_{c \in A-B} t_c$

$e'_A = \sum_{B \subseteq A} R(B/A) \prod_{a \in A} t_a = \prod_{a \in A} t_a \sum_{B \subseteq A} R(B/A) = \prod_{a \in A} t_a = e_A$

where we have recalled that $B \subseteq A$ to make $(A - B) \cup B = A$, and the fact
that $R(B/A)$ is a probability distribution over the subsets of $A$ whose sum over
all of them renders 1. We have proved that $e_A$ defines indeed an equilibrium value.
Let us relate its expression with frequencies. To that aim, let us now suppose
that each $t(A)$ equals $e_A$; then by applying the inverse transform (20) and (21), we
can calculate the equilibrium value of $p(A)$ for any $A$:

$p(A) = p(A/N) = 2^{-n} \sum_{B \subseteq N} \S(A \cap B)\, e_B = 2^{-n} \sum_{B \subseteq N} \S(A \cap B) \prod_{a \in B} t_a$

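To proceed further, let us learn from an example. Let $N = \{1, 2, 3, 4\}$ and
$A = \{1, 3\}$, then

$\prod_{a \in A} (1 - t_a) \prod_{b \in A'} (1 + t_b)$
$= (1 - t_1)(1 - t_3)(1 + t_2)(1 + t_4)$
$= (1 - t_1)(1 - t_3)(1 + t_2 + t_4 + t_2 t_4)$
$= (1 - t_3)(1 + t_2 + t_4 + t_2 t_4 - t_1 - t_1 t_2 - t_1 t_4 - t_1 t_2 t_4)$
$= 1 + t_2 + t_4 + t_2 t_4 - t_1 - t_1 t_2 - t_1 t_4 - t_1 t_2 t_4 - t_3 - t_3 t_2 - t_3 t_4 - t_3 t_2 t_4 + t_3 t_1 + t_3 t_1 t_2 + t_3 t_1 t_4 + t_3 t_1 t_2 t_4$
$= 1 - t_1 + t_2 - t_3 + t_4 - t_1 t_2 + t_3 t_1 - t_1 t_4 - t_3 t_2 + t_2 t_4 - t_3 t_4 + t_3 t_1 t_2 - t_1 t_2 t_4 + t_3 t_1 t_4 - t_3 t_2 t_4 + t_3 t_1 t_2 t_4$
$= t(\emptyset) + \S(A \cap \{1\})\, t_1 + \S(A \cap \{2\})\, t_2 + \S(A \cap \{3\})\, t_3 + \S(A \cap \{4\})\, t_4$
$+ \S(A \cap \{1,2\})\, t_1 t_2 + \S(A \cap \{1,3\})\, t_1 t_3 + \S(A \cap \{1,4\})\, t_1 t_4 + \S(A \cap \{2,3\})\, t_2 t_3$
$+ \S(A \cap \{2,4\})\, t_2 t_4 + \S(A \cap \{3,4\})\, t_3 t_4 + \S(A \cap \{1,2,3\})\, t_1 t_2 t_3$
$+ \S(A \cap \{1,2,4\})\, t_1 t_2 t_4 + \S(A \cap \{1,3,4\})\, t_1 t_3 t_4 + \S(A \cap \{2,3,4\})\, t_2 t_3 t_4$
$+ \S(A \cap \{1,2,3,4\})\, t_1 t_2 t_3 t_4$
$= \sum_{B \subseteq N} \S(A \cap B) \prod_{a \in B} t_a$, where $t(\emptyset) = 1$ and $\S(\emptyset) = 1$.

Therefore, we have in general:

$\sum_{B \subseteq N} \S(A \cap B) \prod_{a \in B} t_a = \prod_{a \in A} (1 - t_a) \prod_{b \in A'} (1 + t_b)$

Using this identity to rewrite $p(A)$, we get:

$p(A) = 2^{-n} \prod_{a \in A} (1 - t_a) \prod_{b \in A'} (1 + t_b)$

$p(A) = \prod_{a \in A} \frac{1 - t_a}{2} \prod_{b \in A'} \frac{1 + t_b}{2}$

where we have used the fact that $n = \#A + \#A'$ to introduce $2^{-n}$ into the
products.

Recalling identity (25) we can write the equilibrium value of the frequencies:

$p(A) = \prod_{a \in A} p(\{a\}/\{a\}) \prod_{b \in A'} p(\emptyset/\{b\})$ (37)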

This expression is simply the product of ordinary frequencies of maternal 
alleles at loci in A by the product of frequencies of paternal alleles outside A. 

Equation (37) shows that the system has a fixed point that depends on
the initial frequencies and does not depend on the scheme of recombination.
For this reason, equilibriums are in general unstable. So, to relate this theory
with experiment, it is mandatory to prove that the scientific research does not
interfere with nature.
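The fixed point can be examined numerically. The sketch below builds $e_A$ from (35), checks that one application of the recurrence (26) leaves it unchanged as stated in (36), and compares the product formula (37) with the inverse transform (20). It assumes the integer bitset encoding of sets; the three-locus frequencies and recombination distribution are arbitrary illustrative values, and all names are hypothetical.

```python
from fractions import Fraction


def parity(a: int) -> int:
    return -1 if bin(a).count("1") % 2 else 1


def subsets(a: int):
    b = a
    while True:
        yield b
        if b == 0:
            return
        b = (b - 1) & a


def marg(f, b, a, n):
    comp = ((1 << n) - 1) & ~a
    return sum(f[b | c] for c in subsets(comp))


def transform(p, n):
    return [sum(parity(a & b) * p[b] for b in range(1 << n)) for a in range(1 << n)]


def inverse(t, n):
    return [Fraction(1, 2 ** n) * sum(parity(a & b) * t[b] for b in range(1 << n))
            for a in range(1 << n)]


def next_t(t, R, n):
    """Recurrence (26)."""
    return [sum(marg(R, b, a, n) * t[b] * t[a ^ b] for b in subsets(a))
            for a in range(1 << n)]


def equilibrium_t(t, n):
    """Definition (35): e_A is the product of the one-locus values t_a, a in A."""
    e = []
    for a in range(1 << n):
        prod = Fraction(1)
        for i in range(n):
            if (a >> i) & 1:
                prod *= t[1 << i]
        e.append(prod)
    return e


def equilibrium_p(t, n):
    """Formula (37) with (25): factors (1 - t_a)/2 for a in A, (1 + t_b)/2 for b outside A."""
    out = []
    for a in range(1 << n):
        prod = Fraction(1)
        for i in range(n):
            ti = t[1 << i]
            prod *= (1 - ti) / 2 if (a >> i) & 1 else (1 + ti) / 2
        out.append(prod)
    return out


n = 3
p = [Fraction(w, 36) for w in (1, 2, 3, 4, 5, 6, 7, 8)]   # illustrative frequencies
R = [Fraction(w, 16) for w in (4, 1, 2, 1, 1, 2, 1, 4)]   # illustrative recombination
t = transform(p, n)
e = equilibrium_t(t, n)
p_eq = equilibrium_p(t, n)
```

Changing `R` while keeping `p` leaves `e` and `p_eq` untouched, which is the independence from the scheme of recombination noted above.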



9 DISEQUILIBRIUM MEASURES 

We expect that an initial condition would evolve toward the equilibrium
defined by it according to (37). To study this claim, we need in the
first place to measure the difference between the actual frequency $p(A)$ and its
equilibrium value given by $\prod_{a \in A} p(\{a\}/\{a\}) \prod_{b \in A'} p(\emptyset/\{b\})$. That difference can
be measured directly by the function $p(A) - \prod_{a \in A} p(\{a\}/\{a\}) \prod_{b \in A'} p(\emptyset/\{b\})$.
This is the most elementary and direct form to measure gametic disequilibrium.
But since $p(A)$ and $\prod_{a \in A} p(\{a\}/\{a\}) \prod_{b \in A'} p(\emptyset/\{b\})$ are unequivocally
determined by $t_A$ and $e_A$, we define the gametic disequilibrium by:

$d_A = t_A - e_A$ (38)

Another measure of gametic disequilibrium is the Bennett measure
of gametic disequilibrium $D_A$, defined by

$D_A = \sum_{B \subseteq A} \S(A - B)\, p(\emptyset/B) \prod_{a \in A - B} p(\emptyset/\{a\})$ (39)

These two forms to measure gametic disequilibrium are related through the
following identities:

$D_A = 2^{-\#A} \sum_{B \subseteq A} \S(A - B)\, d_B\, e_{A-B} = 2^{-\#A} \sum_{C \subseteq A} \S(C)\, d_{A-C}\, e_C$ (40)

$d_A = \sum_{B \subseteq A} \sum_{C \subseteq B,\, C \neq \emptyset} 2^{\#B}\, \S(A - B)\, D_C \prod_{c \in B - C} p(\emptyset/\{c\})$ (41)

The two expressions of $D_A$ in (40) are equivalent and result from a change
of variable: $C = A - B$, which, given that $B \subseteq A$, implies that $B = A - C$.

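Let us prove (40).

The proof for the case $A = N$ follows. Equation (39) now reads:

$D_N = \sum_{B \subseteq N} \S(N - B)\, p(\emptyset/B) \prod_{a \in N - B} p(\emptyset/\{a\})$
$= \sum_{B \subseteq N} \S(B')\, p(\emptyset/B) \prod_{a \in B'} p(\emptyset/\{a\})$

So, let us replace in this equation the marginal probabilities by their transformed
frequencies in accordance with (21) and (25). Equation (21) reads:

$p(B/A) = 2^{-\#A} \sum_{C \subseteq A} \S(B \cap C)\, t(C)$, given that $B \subseteq A$. Therefore:
$p(\emptyset/A) = 2^{-\#A} \sum_{C \subseteq A} \S(\emptyset \cap C)\, t(C)$
$p(\emptyset/B) = 2^{-\#B} \sum_{C \subseteq B} \S(\emptyset \cap C)\, t(C)$

while equation (25) is:

$p(\emptyset/\{a\}) = 2^{-1}(\S\emptyset\; t(\emptyset) + \S\emptyset\; t(\{a\})) = (1/2)(1 + t_a)$

Replacing these equations in the last version of (39), we get:

$D_N = \sum_{B \subseteq N} \S(B') \big(2^{-\#B} \sum_{C \subseteq B} \S(\emptyset \cap C)\, t(C)\big) \prod_{a \in B'} (1 + t_a)/2$
$= \sum_{B \subseteq N} 2^{-\#B}\, \S(B') \sum_{C \subseteq B} t(C)\; 2^{-\#B'} \prod_{a \in B'} (1 + t_a)$

because $\S(\emptyset \cap C) = 1$ and inside the product ($\prod$) there are $\#B'$ terms
divided by 2. As $2^{-\#B}\, 2^{-\#B'} = 2^{-n}$, we get:

$D_N = 2^{-n} \sum_{B \subseteq N} \sum_{C \subseteq B} \S(B')\, t(C) \prod_{a \in B'} (1 + t_a)$

Let us notice now that:

$(1 + t_1)(1 + t_2) = 1 + t_1 + t_2 + t_1 t_2$

$(1 + t_1)(1 + t_2)(1 + t_3) = (1 + t_1 + t_2 + t_1 t_2)(1 + t_3)$
$= 1 + t_1 + t_2 + t_1 t_2 + t_3 + t_1 t_3 + t_2 t_3 + t_1 t_2 t_3$
$= 1 + t_1 + t_2 + t_3 + t_1 t_2 + t_1 t_3 + t_2 t_3 + t_1 t_2 t_3$

So, we have in general:

$\prod_{a \in B'} (1 + t_a) = \sum_{A \subseteq B'} \prod_{a \in A} t_a = \sum_{A \subseteq B'} e_A$

Therefore

$D_N = 2^{-n} \sum_{B \subseteq N} \sum_{C \subseteq B} \S(B')\, t(C) \sum_{A \subseteq B'} e_A$

$D_N = 2^{-n} \sum_{B \subseteq N} \sum_{C \subseteq B} \sum_{A \subseteq B'} \S(B')\, t(C)\, e_A$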



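Figure 8. This chart represents the situation given by $B \subseteq N$, $C \subseteq B$, $A \subseteq B'$,
where $N$ is the universal set. The change of variable $K = B - C$ renders the
equivalent set of conditions: $A \subseteq N$, $C \subseteq A'$, $K \subseteq (A \cup C)'$.

Now, let $K = B - C$ (see figure 8); then we can make a change of variables
with $A \subseteq N$, $C \subseteq A'$, $K \subseteq (A \cup C)'$. We have:

$D_N = 2^{-n} \sum_{A \subseteq N} \sum_{C \subseteq A'} \sum_{K \subseteq (A \cup C)'} \S((C \cup K)')\, t(C)\, e_A$

Let us prove now that $\S((C \cup K)') = \S(C')\, \S(K)$. In effect, let us remember that
$K = B - C \subseteq C'$ and so $K \cup C' = C'$, that parity is $\pm 1$ so its square is always
1, and then we apply property (15) of parity over disjoint sets:

$\S((C \cup K)') = \S(K)\, \S(K)\, \S((C \cup K)') = \S(K)\, \S(K \cup (C \cup K)')$
$= \S(K)\, \S(K \cup (C' \cap K'))$
$= \S(K)\, \S((K \cup C') \cap (K \cup K')) = \S(K)\, \S(C' \cap N) = \S(K)\, \S(C')$
$= \S(C')\, \S(K)$

Therefore:

$D_N = 2^{-n} \sum_{A \subseteq N} \sum_{C \subseteq A'} \sum_{K \subseteq (A \cup C)'} \S(C')\, \S(K)\, t(C)\, e_A$
$D_N = 2^{-n} \sum_{A \subseteq N} \sum_{C \subseteq A'} \sum_{K \subseteq (A \cup C)'} \S(C')\, \S(K \cap N)\, t(C)\, e_A$
$D_N = 2^{-n} \sum_{A \subseteq N} \sum_{C \subseteq A'} \S(C')\, t(C)\, e_A \sum_{K \subseteq (A \cup C)'} \S(K \cap N)$

Here the important terms in the last sum are determined by (14), a condition
that requires that $N \cap (A \cup C)' = \emptyset$, i.e., that $A \cup C = N$, which implies that
$C = A'$, $C' = A$, while the variable $K$ runs over $K \subseteq (A \cup C)' = \emptyset$, so $K = \emptyset$,
while $\sum \S(K \cap N)$ reduces to $\S(\emptyset) = 1$. Therefore

$D_N = 2^{-n} \sum_{A \subseteq N} \S(A)\, t(A')\, e_A$ (42)

Recalling that $d_{A'} = t(A') - e_{A'}$, we get:

$D_N = 2^{-n} \sum_{A \subseteq N} \S(A)(d_{A'} + e_{A'})\, e_A$
$= 2^{-n} \sum_{A \subseteq N} \S(A)\, d_{A'}\, e_A + 2^{-n} \sum_{A \subseteq N} \S(A)\, e_{A'}\, e_A$

If we use the fact that $e_{A'}\, e_A = e_N$, which can be factored out of the second sum,
we get:

$D_N = 2^{-n} \sum_{A \subseteq N} \S(A)\, d_{A'}\, e_A + 2^{-n} \sum_{A \subseteq N} \S(A)\, e_N$
$= 2^{-n} \sum_{A \subseteq N} \S(A)\, d_{A'}\, e_A + 2^{-n} e_N \sum_{A \subseteq N} \S(A) = 2^{-n} \sum_{A \subseteq N} \S(A)\, d_{A'}\, e_A$

where we have applied (13) to $N \neq \emptyset$ to render $\sum_{A \subseteq N} \S(A) = 0$. We end
with

$D_N = 2^{-n} \sum_{A \subseteq N} \S(A)\, d_{A'}\, e_A$

which is (40) for $A = N$.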

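Now, let us engage in the proof of (41) for $A = N$:

$d_N = \sum_{B \subseteq N} \sum_{C \subseteq B,\, C \neq \emptyset} 2^{\#B}\, \S(N - B)\, D_C \prod_{c \in B - C} p(\emptyset/\{c\})$
$= R_1 - R_2$

where $R_1$ differs from $d_N$ in the terms involving $C = \emptyset$ and $R_2$ is the
corresponding compensation:

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} 2^{\#B}\, \S(B')\, D_C \prod_{c \in B - C} p(\emptyset/\{c\})$

$R_2 = \sum_{B \subseteq N} 2^{\#B}\, \S(B')\, D_\emptyset \prod_{c \in B} p(\emptyset/\{c\})$

To prove (41) it is enough to show that $R_1 = t(N)$ and $R_2 = e_N$, because by
definition (38) $d_N = t(N) - e_N$.

Let us replace $D_C$ in $R_1$ by its value induced from (42) above and renumber
in the inverse order:

$D_C = 2^{-\#C} \sum_{K \subseteq C} \S(K)\, t(C - K)\, e_K = 2^{-\#C} \sum_{K \subseteq C} \S(C - K)\, t(K)\, e_{C-K}$

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} 2^{\#B}\, \S(B') \big[2^{-\#C} \sum_{K \subseteq C} \S(C - K)\, t(K)\, e_{C-K}\big] \prod_{c \in B - C} p(\emptyset/\{c\})$

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} \sum_{K \subseteq C} 2^{\#(B-C)}\, \S(B')\, \S(C - K)\, t(K)\, e_{C-K} \prod_{c \in B - C} (1 + t_c)/2$

where we have recalled (25) to replace the marginal probabilities. The factors
$(1/2)$ can be taken out of the product:

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} \sum_{K \subseteq C} 2^{\#(B-C)}\, 2^{-\#(B-C)}\, \S(B')\, \S(C - K)\, t(K)\, e_{C-K} \prod_{c \in B - C} (1 + t_c)$

The product can be expanded using a formula that was found above:

$\prod_{a \in M} (1 + t_a) = \sum_{A \subseteq M} \prod_{a \in A} t_a = \sum_{A \subseteq M} e_A$

If we apply this to $R_1$ with $M = B - C$:

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} \sum_{K \subseteq C} 2^{\#(B-C)}\, 2^{-\#(B-C)}\, \S(B')\, \S(C - K)\, t(K)\, e_{C-K} \sum_{O \subseteq B - C} e_O$

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} \sum_{K \subseteq C} \sum_{O \subseteq B - C} 2^{\#(B-C)}\, 2^{-\#(B-C)}\, \S(B')\, \S(C - K)\, t(K)\, e_{C-K}\, e_O$

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} \sum_{K \subseteq C} \sum_{O \subseteq B - C} \S(B')\, \S(C - K)\, t(K)\, e_{C-K}\, e_O$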

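To realize what kind of situation we have, let us pay attention to figure 9.

Since $B'$ and $C - K$ are disjoint sets, then by (15)

$\S(B')\, \S(C - K) = \S(B' \Delta (C - K)) = \S(N \Delta B \Delta C \Delta K)$

because $B' = N \Delta B$ and $C - K = C \Delta K$ since $K \subseteq C$.
On the other hand, $e_{C-K}\, e_O = e_{C \Delta K}\, e_O = e_{C \Delta K \Delta O}$.

Now, we can make a change of variables: let $X = B - (C \cup O)$ and $Y = C - K$;
therefore $B = C \cup O \cup X = C \Delta O \Delta X$ and $C = K \cup Y = K \Delta Y$. Similarly,
$N \Delta B \Delta C \Delta K = N \Delta C \Delta O \Delta X \Delta C \Delta K = N \Delta O \Delta X \Delta K$ and $C \Delta K \Delta O =
K \Delta Y \Delta K \Delta O = Y \Delta O$ and $B - C = X \cup O = X \Delta O$. We have at last that
$\S(B')\, \S(C - K) = \S(N \Delta O \Delta X \Delta K) = \S(N \Delta O \Delta K)\, \S(X)$ and $e_{C-K}\, e_O = e_{Y \Delta O}$.

In short, the change of variables is:

$K$ is the independent variable.

$O \subseteq K'$

$Y = C - K$;

$X = B - (C \cup O)$

$\S(B')\, \S(C - K) = \S(N \Delta O \Delta X \Delta K) = \S(N \Delta O \Delta K)\, \S(X)$

$e_{C-K}\, e_O = e_{Y \Delta O}$

Let us notice that

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} \sum_{K \subseteq C} \sum_{O \subseteq B - C} \S(B')\, \S(C - K)\, t(K)\, e_{C-K}\, e_O$

has the general form

$R_1 = \sum \S(B')\, \S(C - K)\, t(K)\, e_{C-K}\, e_O$

and with the change of variables, this expression takes the form

$R_1 = \sum \S(N \Delta O \Delta K)\, \S(X)\, t(K)\, e_{Y \Delta O}$

Figure 9. This is a graphic representation of the following conditions: $B \subseteq N$,
$C \subseteq B$, $K \subseteq C$, $O \subseteq B - C$. The new disjoint sets $X = B - (C \cup O)$ (in stars)
and $Y = C - K$ (in lines) are indicated.

Therefore, the domain of the sums can be defined by

$K \subseteq N$, $O \subseteq K'$, $Y \subseteq (K \cup O)'$, $X \subseteq (K \cup O \cup Y)' = N \cap (K \cup O \cup Y)'$.

This set of conditions will be referred to as F and we can rewrite

$R_1 = \sum_{F} \S(N \Delta O \Delta K)\, \S(X)\, t(K)\, e_{Y \Delta O}$

Let us separate the terms with $\S(X)$:

$R_1 = \sum \S(N \Delta O \Delta K)\, t(K)\, e_{Y \Delta O} \sum_{X \subseteq (K \cup O \cup Y)'} \S(X)$

Let us apply the law (13) to $\S(X)$: when $X \subseteq (K \cup O \cup Y)'$, the only nonzero
terms that go in this sum are those corresponding to the case $(K \cup O \cup Y)' =
\emptyset$, i.e., $X = \emptyset$, rendering that $\sum \S(X) = 1$. But if $(K \cup O \cup Y)' = \emptyset$ then
$K \cup O \cup Y = N$, and given that these three sets are disjoint this is equivalent to
$K' = Y \cup O = Y \Delta O$. Therefore

$R_1 = \sum \S(N \Delta O \Delta K)\, t(K)\, e_{K'}$

Hence condition F reduces to $K \subseteq N$, $O \subseteq K'$. So,

$R_1 = \sum_{K \subseteq N} \sum_{O \subseteq K'} \S(N \Delta O \Delta K)\, t(K)\, e_{K'}$

Let us notice now that

$\S(N \Delta O \Delta K) = \S((O \cup K)') = \S(O' \cap K') = \S O\, \S O\, \S(O' \cap K')$
$= \S O\, \S(O \cup (O' \cap K')) = \S O\, \S((O \cup O') \cap (O \cup K'))$
$= \S O\, \S(N \cap K') = \S O\, \S K' = \S K'\, \S O$

Therefore:

$R_1 = \sum_{K \subseteq N} \sum_{O \subseteq K'} \S(K')\, \S(O)\, t(K)\, e_{K'}$
$= \sum_{K \subseteq N} \S(K')\, t(K)\, e_{K'} \sum_{O \subseteq K'} \S(O)$

which gives after (13) that the important terms in the last sum are those with
$K' = \emptyset$, i.e., $K = N$; then $\S(K') = \S(\emptyset) = 1$ and $e_{K'} = e_\emptyset = 1$. We are left
with $t(N)$ alone and we have proved, as it was promised, that

$R_1 = \sum_{B \subseteq N} \sum_{C \subseteq B} 2^{\#B}\, \S(B')\, D_C \prod_{c \in B - C} p(\emptyset/\{c\}) = t(N)$ (43)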

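Now, let us retake $R_2$, which must be proved to equal $e_N$:

$R_2 = \sum_{B \subseteq N} 2^{\#B}\, \S(B')\, D_\emptyset \prod_{c \in B} p(\emptyset/\{c\}) = \sum_{B \subseteq N} 2^{\#B}\, \S(B')\, D_\emptyset\, 2^{-\#B} \sum_{C \subseteq B} e_C$
$= \sum_{B \subseteq N} \sum_{C \subseteq B} \S(B')\, e_C$

where the product of the marginal distributions has been expanded, as
usual, into a sum of $e_C$ and we have used the fact that $D_\emptyset = 1$. Let $B - C = Z$
be a new variable. Then

$B = C \cup Z$

$B' = (C \cup Z)'$

and, as before,

$\S(B') = \S(C')\, \S(Z)$

With this change of variable, $R_2$ becomes:

$R_2 = \sum_{C \subseteq N} \sum_{Z \subseteq C'} \S((C \cup Z)')\, e_C = \sum_{C \subseteq N} e_C\, \S(C') \sum_{Z \subseteq C'} \S(Z)$

According to (13), in the last sum there is only one nonzero term, corresponding
to $C' = \emptyset$, i.e., $C = N$, and therefore $R_2 = e_N\, \S(\emptyset) = e_N$. We have proved
that:

$R_2 = \sum_{B \subseteq N} 2^{\#B}\, \S(B')\, D_\emptyset \prod_{c \in B} p(\emptyset/\{c\}) = \sum_{B \subseteq N} \sum_{C \subseteq B} \S(B')\, e_C = e_N$ (44)

This ends the proof that the two measures of disequilibrium $d$ and $D$ are
related by (40) and (41).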

One of the most interesting properties of the disequilibrium value $D_A$ defined
by (39) is that it obeys the same recurrence equation as the transformed
frequencies. If $D'_A$ is the gamete disequilibrium in the offspring generation, then

$D'_A = \sum_{B \subseteq A} R(B/A)\, D_B\, D_{A-B}$ (45)

Thanks to this equation, one algorithm can be used to calculate both the
evolution of transformed frequencies and that of gamete disequilibrium.
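The Bennett measure (39) and the recurrence (45) can be compared numerically. In the sketch below the offspring frequencies are obtained through (26) and (20), the Bennett measure is computed directly from definition (39) in both generations, and the offspring values are compared with the right side of (45). The bitset encoding, the three-locus input values and the function names are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction


def card(a: int) -> int:
    return bin(a).count("1")


def parity(a: int) -> int:
    return -1 if card(a) % 2 else 1


def subsets(a: int):
    b = a
    while True:
        yield b
        if b == 0:
            return
        b = (b - 1) & a


def marg(f, b, a, n):
    comp = ((1 << n) - 1) & ~a
    return sum(f[b | c] for c in subsets(comp))


def transform(p, n):
    return [sum(parity(a & b) * p[b] for b in range(1 << n)) for a in range(1 << n)]


def inverse(t, n):
    return [Fraction(1, 2 ** n) * sum(parity(a & b) * t[b] for b in range(1 << n))
            for a in range(1 << n)]


def next_t(t, R, n):
    return [sum(marg(R, b, a, n) * t[b] * t[a ^ b] for b in subsets(a))
            for a in range(1 << n)]


def bennett(p, a, n):
    """Definition (39): sum over B inside A of parity(A - B) * p(empty/B)
    times the product of p(empty/{c}) over the loci c in A - B."""
    total = Fraction(0)
    for b in subsets(a):
        rest = a ^ b
        term = parity(rest) * marg(p, 0, b, n)
        for i in range(n):
            if (rest >> i) & 1:
                term *= marg(p, 0, 1 << i, n)
        total += term
    return total


n = 3
p = [Fraction(w, 36) for w in (1, 2, 3, 4, 5, 6, 7, 8)]   # illustrative frequencies
R = [Fraction(w, 16) for w in (4, 1, 2, 1, 1, 2, 1, 4)]   # illustrative recombination

D = [bennett(p, a, n) for a in range(1 << n)]
p_next = inverse(next_t(transform(p, n), R, n), n)        # one generation via (26), (20)
D_next = [bennett(p_next, a, n) for a in range(1 << n)]
D_rec = [sum(marg(R, b, a, n) * D[b] * D[a ^ b] for b in subsets(a))
         for a in range(1 << n)]                          # right side of (45)
```

In this encoding `D[0]` is $D_\emptyset = 1$ and the one-locus values `D[1]`, `D[2]`, `D[4]` vanish, so for two and three loci the recurrence reduces to a geometric decay of the disequilibrium.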

To demonstrate (45) we express D'j^ in terms of transformed frequencies 
in the offspring generation, as stated in (42), reorder, and then replace these 
transformed frequencies by their values given by recurrence equation (26): 

D'n = 2-" Ea<a^ mt'{A')eA = 2-" EA<Ar ^{A')t'{A)eA- 
= 2-" Ea<n §(^')eA' Eb<a R{B/A)t{B)t{A - B) 

Recalling (6') to replace R{B/A), we get: 

D'j, = 2-^J2 J2 H MeA'R{B U C)t{B)t{A - B) (46) 

A<N B<A C<A' 

Now we need to express t{B) and t{A — B) in terms of the -D^s, which is 
done by inducing their values from t{N) in ( 43) : 

B<A C<B ceB-C 

m = j:b<a j:c<b 2*^§(^ - b)Dc2-*('^-^) j:k<b-c 

= T.B<A T.c<B Ekkb-c 2*^§(^ - B)DceK 

Please make a graphic of the conditions above, $B \le A$, $C \le B$, $K \le B - C$,
taking $A$ as the universal set, and verify the next change of variables:

Let $Y = B - (C \cup K)$, or $B = Y \cup C \cup K$ and $A - B = A - (C \cup Y \cup K)$,
which renders for $B \le A$ that $\S(A-B) = \S(A - (C \cup Y \cup K)) = \S((C \cup Y \cup K)')$,
where the complement is relative to $A$. We have, as it was done before:

$$\S((C \cup Y \cup K)') = \S((C \cup K \cup Y)') = \S((C \cup K)')\,\S(Y) = \S(A - (C \cup K))\,\S(Y)$$

Therefore:

$$\S(A-B) = \S(A - (C \cup K))\,\S(Y)$$

On the other hand, the set of conditions $B \le A$, $C \le B$, $K \le B - C$ is
equivalent to $C \le A$, $K \le A - C$, $Y \le A - (C \cup K)$. Replacing this in $t(A)$
above, we get:

$$t(A) = \sum_{C \le A} \sum_{K \le A-C} 2^{\#C}\, D_C\, e_K\, \S(A - (C \cup K)) \sum_{Y \le A-(C \cup K)} \S(Y)$$

The last sum vanishes unless $A - (C \cup K) = \emptyset$, in which case it is ONE,
i.e., $A = C \cup K$ or $K = A - C$, and because $\S(A - (C \cup K)) = \S(\emptyset) = 1$,
finally we get:

$$t(A) = \sum_{C \le A} 2^{\#C}\, D_C\, e_{A-C} \qquad (48)$$
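Equation (48) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: gametes are integer bit masks over three loci, p(∅/B) is read as the frequency of gametes with no bit set inside B, §(A) is read as (-1)^#A, t(A) is the sum of §(A ∩ B) p(B) over all gametes B, e_A is the product of the single-locus transformed frequencies, and D_A follows (39). All function names are hypothetical.

```python
import random


def subsets(mask):
    """Yield every subset of the bitset `mask`."""
    sub = mask
    while True:
        yield sub
        if sub == 0:
            return
        sub = (sub - 1) & mask


def size(mask):
    return bin(mask).count("1")


def sign(mask):
    """The sign function: +1 for an even number of elements, -1 otherwise."""
    return -1 if size(mask) % 2 else 1


def loci(mask):
    """Single-locus bitsets contained in `mask`."""
    return [1 << i for i in range(mask.bit_length()) if (mask >> i) & 1]


def marginal0(p, B):
    """p(0/B): frequency of gametes with no bit set inside B (assumed reading)."""
    return sum(f for g, f in enumerate(p) if (g & B) == 0)


def t_of(p, A):
    """Transformed frequency t(A)."""
    return sum(sign(A & g) * f for g, f in enumerate(p))


def e_of(p, A):
    """Equilibrium value e_A: product of single-locus transformed frequencies."""
    out = 1.0
    for locus in loci(A):
        out *= t_of(p, locus)
    return out


def D_of(p, A):
    """Disequilibrium D_A as in (39)."""
    total = 0.0
    for B in subsets(A):
        prod = 1.0
        for locus in loci(A & ~B):
            prod *= marginal0(p, locus)
        total += sign(A & ~B) * marginal0(p, B) * prod
    return total


def t_from_48(p, A):
    """Right-hand side of (48)."""
    return sum(2 ** size(C) * D_of(p, C) * e_of(p, A & ~C) for C in subsets(A))


random.seed(1)
raw = [random.random() for _ in range(8)]
p = [x / sum(raw) for x in raw]          # arbitrary three-locus frequencies
max_gap = max(abs(t_of(p, A) - t_from_48(p, A)) for A in range(8))
```

With these readings, `max_gap` should be zero up to rounding error for any frequency vector.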

Therefore

$$t(B)\,t(A-B) = \Big(\sum_{X \le B} 2^{\#X} D_X\, e_{B-X}\Big)\Big(\sum_{Y \le A-B} 2^{\#Y} D_Y\, e_{(A-B)-Y}\Big)$$

and coming back to (46)

$$D'_N = 2^{-n} \sum_{C4} 2^{\#(X \cup Y)}\, D_X D_Y\, R(B \cup C)\, e_{A'}\, e_{B-X}\, e_{(A-B)-Y}\, \S(A')$$

where C4 stands for the conditions $A \le N$, $B \le A$, $C \le A'$, $X \le B$,
$Y \le A - B$, a situation that is visualized in figure 10.

Since $A'$, $B - X$ and $(A - B) - Y$ are disjoint sets, we have that

$$e_{A'}\, e_{B-X}\, e_{(A-B)-Y} = e_{A' \cup (B-X) \cup ((A-B)-Y)} = e_{(X \cup Y)'}$$

which is a more tractable expression.





Figure 10. Here we have the representation of: $A \le N$, $B \le A$, $C \le A'$,
$X \le B$, $Y \le A - B$. Subsets $E = B - X$ (oblique lines) and
$F = A - (X \cup Y \cup E)$ (horizontal lines) are indicated.

To introduce a change of variables, let $E = B - X$, $F = A - (X \cup Y \cup E)$, so
that $A = X \cup Y \cup E \cup F$. Therefore
$\S(A') = \S((X \cup Y \cup E \cup F)') = \S((X \cup Y \cup E)')\,\S(F)$ and
$R(B \cup C) = R(X \cup E \cup C)$. Hence, the old expression

$$D'_N = 2^{-n} \sum_{C4} 2^{\#(X \cup Y)}\, D_X D_Y\, R(B \cup C)\, e_{A'}\, e_{B-X}\, e_{(A-B)-Y}\, \S(A')$$

becomes

$$D'_N = 2^{-n} \sum_{C5} 2^{\#(X \cup Y)}\, D_X D_Y\, R(X \cup E \cup C)\, e_{(X \cup Y)'}\, \S((X \cup Y \cup E)')\, \S(F)$$

where C5 is given by $X \le N$, $Y \le X'$, $E \le (X \cup Y)'$, $C \le (X \cup E \cup Y)'$,
$F \le (X \cup E \cup Y \cup C)'$. We can separate the sum of $\S(F)$ with the condition
$F \le (X \cup E \cup Y \cup C)'$. This sum matters only when $(X \cup E \cup Y \cup C)' = \emptyset$,
i.e., $X \cup E \cup Y \cup C = N$, or $C = (X \cup E \cup Y)'$ and

$$F \le (X \cup E \cup Y \cup C)' = ((X \cup E \cup Y) \cup (X \cup E \cup Y)')' = N' = \emptyset.$$

In short, $F = \emptyset$ and the corresponding sum adds up to 1.

Then

$$R(X \cup E \cup C) = R(X \cup E \cup (X \cup E \cup Y)') = R(X \cup E \cup (N - (X \cup E \cup Y))) = R(N - Y) = R(Y')$$

$$D'_N = 2^{-n} \sum_{C6} 2^{\#(X \cup Y)}\, D_X D_Y\, R(Y')\, e_{(X \cup Y)'}\, \S((X \cup Y \cup E)')$$

where C6 stands for $X \le N$, $Y \le X'$, $E \le (X \cup Y)'$, $C = (X \cup E \cup Y)'$,
and using an old trick, $\S((X \cup Y \cup E)') = \S((X \cup Y)')\,\S(E)$. We note that
$\sum \S(E)$ can be factored with the condition that $E \le (X \cup Y)'$, rendering that
$(X \cup Y)' = \emptyset$ or $X \cup Y = N$, i.e. $Y = X'$. We have

$$D'_N = 2^{-n} \sum_{X \le N} 2^{\#(X \cup X')}\, D_X D_{X'}\, R(X)\, e_\emptyset\, \S((X \cup X')')$$

Using $\#(X \cup X') = \#(N) = n$ and $e_\emptyset = 1$ while
$\S((X \cup X')') = \S(N') = \S(\emptyset) = 1$, we get at last:

$$D'_N = \sum_{X \le N} D_X D_{X'}\, R(X)$$

This finishes the proof of (45).

Example. Let us calculate some instances of the initial and subsequent
disequilibrium given by (39) and (45) respectively. Formula (39) for the initial
disequilibrium reads:

$$D_A = \sum_{B \le A} \S(A-B)\, p(\emptyset/B) \prod_{a \in A-B} p(\emptyset/\{a\})$$

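We have:

1. $D_\emptyset = \S(\emptyset)\, p(\emptyset/\emptyset) = 1$, where we are making
$\prod_{a \in \emptyset} p(\emptyset/\{a\}) = p(\emptyset/\emptyset) = 1$.

2. $D_{\{a\}} = \S(\{a\}-\emptyset)\, p(\emptyset/\emptyset)\, p(\emptyset/\{a\}) + \S(\{a\}-\{a\})\, p(\emptyset/\{a\})\, p(\emptyset/\emptyset)$

$= \S(\{a\})\, p(\emptyset/\{a\}) + \S(\emptyset)\, p(\emptyset/\{a\}) = -p(\emptyset/\{a\}) + p(\emptyset/\{a\}) = 0$

3. $D_{\{a,b\}}$

$= \S(\{a,b\}-\emptyset)\, p(\emptyset/\emptyset)\, p(\emptyset/\{a\})\, p(\emptyset/\{b\}) + \S(\{a,b\}-\{a\})\, p(\emptyset/\{a\})\, p(\emptyset/\{b\})$
$+ \S(\{a,b\}-\{b\})\, p(\emptyset/\{b\})\, p(\emptyset/\{a\}) + \S(\{a,b\}-\{a,b\})\, p(\emptyset/\{a,b\})\, p(\emptyset/\emptyset)$

$= p(\emptyset/\{a\})\, p(\emptyset/\{b\}) - p(\emptyset/\{a\})\, p(\emptyset/\{b\}) - p(\emptyset/\{b\})\, p(\emptyset/\{a\}) + p(\emptyset/\{a,b\})$

$= p(\emptyset/\{a,b\}) - p(\emptyset/\{a\})\, p(\emptyset/\{b\})$

4. $D_{\{a,b,c\}} = \sum_{B \le A} \S(\{a,b,c\}-B)\, p(\emptyset/B) \prod_{x \in \{a,b,c\}-B} p(\emptyset/\{x\})$

$= \S(\{a,b,c\}-\emptyset)\, p(\emptyset/\emptyset)\, p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})$
$+ \S(\{a,b,c\}-\{a\})\, p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})$
$+ \S(\{a,b,c\}-\{b\})\, p(\emptyset/\{b\})\, p(\emptyset/\{a\})\, p(\emptyset/\{c\})$
$+ \S(\{a,b,c\}-\{c\})\, p(\emptyset/\{c\})\, p(\emptyset/\{a\})\, p(\emptyset/\{b\})$
$+ \S(\{a,b,c\}-\{a,b\})\, p(\emptyset/\{a,b\})\, p(\emptyset/\{c\})$
$+ \S(\{a,b,c\}-\{a,c\})\, p(\emptyset/\{a,c\})\, p(\emptyset/\{b\})$
$+ \S(\{a,b,c\}-\{b,c\})\, p(\emptyset/\{b,c\})\, p(\emptyset/\{a\})$
$+ \S(\{a,b,c\}-\{a,b,c\})\, p(\emptyset/\{a,b,c\})\, p(\emptyset/\emptyset)$

$= -p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})$
$+ p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})$
$+ p(\emptyset/\{b\})\, p(\emptyset/\{a\})\, p(\emptyset/\{c\})$
$+ p(\emptyset/\{c\})\, p(\emptyset/\{a\})\, p(\emptyset/\{b\})$
$- p(\emptyset/\{a,b\})\, p(\emptyset/\{c\})$
$- p(\emptyset/\{a,c\})\, p(\emptyset/\{b\})$
$- p(\emptyset/\{b,c\})\, p(\emptyset/\{a\})$
$+ p(\emptyset/\{a,b,c\})$

$= p(\emptyset/\{a,b,c\}) - p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})$
$+ \big(-p(\emptyset/\{a,b\})\, p(\emptyset/\{c\}) + p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})\big)$
$+ \big(-p(\emptyset/\{a,c\})\, p(\emptyset/\{b\}) + p(\emptyset/\{b\})\, p(\emptyset/\{a\})\, p(\emptyset/\{c\})\big)$
$+ \big(-p(\emptyset/\{b,c\})\, p(\emptyset/\{a\}) + p(\emptyset/\{c\})\, p(\emptyset/\{a\})\, p(\emptyset/\{b\})\big)$

$= p(\emptyset/\{a,b,c\}) - p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})$
$- p(\emptyset/\{c\})\big(p(\emptyset/\{a,b\}) - p(\emptyset/\{a\})\, p(\emptyset/\{b\})\big)$
$- p(\emptyset/\{b\})\big(p(\emptyset/\{a,c\}) - p(\emptyset/\{a\})\, p(\emptyset/\{c\})\big)$
$- p(\emptyset/\{a\})\big(p(\emptyset/\{b,c\}) - p(\emptyset/\{b\})\, p(\emptyset/\{c\})\big)$

$= \big(p(\emptyset/\{a,b,c\}) - p(\emptyset/\{a\})\, p(\emptyset/\{b\})\, p(\emptyset/\{c\})\big)$
$- p(\emptyset/\{c\})\, D_{\{a,b\}}$
$- p(\emptyset/\{b\})\, D_{\{a,c\}}$
$- p(\emptyset/\{a\})\, D_{\{b,c\}}$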

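5. $D'_\emptyset = D_\emptyset = 1$. We use $D'_A = \sum_{B \le A} R(B/A)\, D_B\, D_{A-B}$.

6. $D'_{\{a\}} = R(\emptyset/\{a\})\, D_\emptyset D_{\{a\}} + R(\{a\}/\{a\})\, D_{\{a\}} D_\emptyset = 0$,
because $D_{\{a\}} = 0$.

7. $D'_{\{a,b\}} = \sum_{B \le A} R(B/\{a,b\})\, D_B\, D_{\{a,b\}-B}$

$= R(\emptyset/\{a,b\})\, D_\emptyset D_{\{a,b\}} + R(\{a\}/\{a,b\})\, D_{\{a\}} D_{\{b\}}$
$+ R(\{b\}/\{a,b\})\, D_{\{b\}} D_{\{a\}} + R(\{a,b\}/\{a,b\})\, D_{\{a,b\}} D_\emptyset$

$= R(\emptyset/\{a,b\})\, D_{\{a,b\}} + R(\{a,b\}/\{a,b\})\, D_{\{a,b\}}$

$= \big(R(\emptyset/\{a,b\}) + R(\{a,b\}/\{a,b\})\big)\, D_{\{a,b\}}$

8. $D'_{\{a,b,c\}} = \sum_{B \le A} R(B/\{a,b,c\})\, D_B\, D_{\{a,b,c\}-B}$

$= R(\emptyset/\{a,b,c\})\, D_\emptyset D_{\{a,b,c\}} + R(\{a\}/\{a,b,c\})\, D_{\{a\}} D_{\{b,c\}}$
$+ R(\{b\}/\{a,b,c\})\, D_{\{b\}} D_{\{a,c\}} + R(\{c\}/\{a,b,c\})\, D_{\{c\}} D_{\{a,b\}}$
$+ R(\{a,b\}/\{a,b,c\})\, D_{\{a,b\}} D_{\{c\}} + R(\{a,c\}/\{a,b,c\})\, D_{\{a,c\}} D_{\{b\}}$
$+ R(\{b,c\}/\{a,b,c\})\, D_{\{b,c\}} D_{\{a\}} + R(\{a,b,c\}/\{a,b,c\})\, D_{\{a,b,c\}} D_\emptyset$

$= R(\emptyset/\{a,b,c\})\, D_{\{a,b,c\}} + R(\{a,b,c\}/\{a,b,c\})\, D_{\{a,b,c\}}$

$= \big(R(\emptyset/\{a,b,c\}) + R(\{a,b,c\}/\{a,b,c\})\big)\, D_{\{a,b,c\}}$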

Using these examples we have no trouble with the universal set N, for it can 
be any set. We have, moreover, shown that for gametes with zero, one, two or 
three loci, the value of the gametic disequilibrium in the offspring generation of 
a given gamete is equal to the product of its gametic disequilibrium in the given 
generation, multiplied by the probability of no recombination among loci in the 
gamete. 
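The instances of (39) worked out in the example can be reproduced with a short Python sketch. It is an illustration only, under these assumptions: loci a, b, c are bits 0, 1, 2 of an integer bit mask, p(∅/B) is read as the frequency of gametes with no bit set inside B, §(A) is (-1)^#A, and the frequency vector `p` is hypothetical.

```python
def subsets(mask):
    """Yield every subset of the bitset `mask`."""
    sub = mask
    while True:
        yield sub
        if sub == 0:
            return
        sub = (sub - 1) & mask


def sign(mask):
    """+1 for an even number of elements, -1 for an odd number."""
    return -1 if bin(mask).count("1") % 2 else 1


def marginal0(p, B):
    """p(0/B): frequency of gametes with no bit set inside B (assumed reading)."""
    return sum(f for g, f in enumerate(p) if (g & B) == 0)


def disequilibrium(p, A):
    """D_A as in (39): sum over B <= A of sign(A-B) p(0/B) prod p(0/{x})."""
    total = 0.0
    for B in subsets(A):
        rest = A & ~B
        prod = 1.0
        for i in range(rest.bit_length()):
            if (rest >> i) & 1:
                prod *= marginal0(p, 1 << i)
        total += sign(rest) * marginal0(p, B) * prod
    return total


# Hypothetical three-locus gamete frequencies, indexed by bit mask 0..7.
p = [0.20, 0.05, 0.10, 0.15, 0.05, 0.20, 0.10, 0.15]
a, b, c = 0b001, 0b010, 0b100
pa, pb, pc = (marginal0(p, x) for x in (a, b, c))
D_ab = disequilibrium(p, a | b)
D_abc = disequilibrium(p, a | b | c)
```

Comparing `D_ab` with `marginal0(p, a | b) - pa * pb`, and `D_abc` with the last expression of item 4, is a quick check of the algebra.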

10 FIXED POINTS AS LIMIT POINTS 

We have developed formulas to calculate evolution of transformed frequencies, 
fixed points of the dynamics and gametic disequilibrium. We discovered that 
each initial condition gives rise to a fixed point of the dynamics and so we 
reasonably expect that each initial condition is absorbed by its corresponding 
fixed point. Nevertheless, things are not that easy: 

Let us, dear reader, confront you with the next objection: fixed points
cannot be limit points of the dynamics because the recurrent formula for the
evolution of transformed frequencies (26) has the same form as that for the
evolution of disequilibrium (45). In fact we have:

$$t'(A) = \sum_{B \le A} R(B/A)\, t(B)\, t(A-B)$$

$$D'_A = \sum_{B \le A} R(B/A)\, D_B\, D_{A-B}$$

Therefore, were the fixed points the limit points of the dynamics, then
the disequilibrium would tend to zero. But, because the transformed frequencies
evolve according to the same law, transformed frequencies would also tend to
zero and so the population would disappear. Hence, the objection predicts that
fixed points are isolated from the general evolution of the diverse frequencies.

We expect you to have a lot of fun trying to make concepts clear. Hint: 
the first step to solve this problem is to explicitly calculate the disequilibrium 
after an arbitrary number of generations for a number of loci equal to 1, 2 or 
3 and to contrast these values with the corresponding ones for the transformed 
frequencies. 
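The first step of the hint can be sketched in Python for two loci. This is a minimal illustration under assumptions: r is a hypothetical recombination fraction with R(∅/{a,b}) + R({a,b}/{a,b}) = 1 - r, the values for the empty set are t(∅) = 1 and D_∅ = 1, and the single-locus disequilibria are zero, so that (26) and (45) for A = {a, b} reduce to the two update lines below. The interpretation of the output is left to the reader.

```python
def iterate_two_loci(t_a, t_b, t_ab, D_ab, r, generations):
    """Iterate (26) and (45) for A = {a, b}; single-locus values do not change."""
    history = [(t_ab, D_ab)]
    for _ in range(generations):
        # (26): the terms B = {a} and B = {b} contribute r * t_a * t_b.
        t_ab = (1 - r) * t_ab + r * t_a * t_b
        # (45): the same terms vanish because D_{a} = D_{b} = 0.
        D_ab = (1 - r) * D_ab
        history.append((t_ab, D_ab))
    return history


# Hypothetical starting point with t_ab = t_a * t_b + 4 * D_ab.
history = iterate_two_loci(t_a=0.2, t_b=-0.4, t_ab=0.3, D_ab=0.095,
                           r=0.2, generations=100)
t_final, D_final = history[-1]
```

Printing a few entries of `history` side by side shows how the two sequences differ even though they obey recurrences of the same form.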

11 THE EFFECT OF MIGRATION 

Until now we have dealt with a panmictic population. In this section we would
like to introduce demification into the population with a general scheme of
migration (a deme is a subpopulation). Let us show that the migration of
individuals originates the same change in the relative frequencies of individuals
and of gametes.

Let us model migration as a discrete operation whose unit of time coincides
with that of reproduction. Before migration, the breeding population in deme
number $i$ is $n_i$ and let $m_{ij}$ be the migration rate per individual from deme $i$ to
$j$. Then, after migration, the population $n_i^m$ is:

$$n_i^m = n_i - \sum_{j \ne i} m_{ij}\, n_i + \sum_{j \ne i} m_{ji}\, n_j$$

where we have included an outflow and an inflow caused by migration.
Organizing, we get:

$$n_i^m = \Big(1 - \sum_{j \ne i} m_{ij}\Big) n_i + \sum_{j \ne i} m_{ji}\, n_j$$
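The update of deme sizes can be written as a small Python function. This is a sketch with hypothetical names: `n` is the list of deme sizes and `m[i][j]` is the migration rate from deme i to deme j.

```python
def migrate_sizes(n, m):
    """Deme sizes after one round of migration; m[i][j] is the rate from i to j."""
    demes = range(len(n))
    return [
        (1 - sum(m[i][j] for j in demes if j != i)) * n[i]   # those who stay
        + sum(m[j][i] * n[j] for j in demes if j != i)       # those who arrive
        for i in demes
    ]


# Two demes with hypothetical sizes and rates.
sizes = migrate_sizes([100.0, 200.0], [[0.0, 0.1], [0.2, 0.0]])
```

In this example deme 0 keeps 90 individuals and receives 40, and the total population is unchanged.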

Let us consider now the effect of migration over genotype frequencies.
Before migration, the frequency of individuals with genotype $(B, C)$ at deme $i$
is $p_i(BC) = p_i(B)\,p_i(C)$ and after migration is $p_i^m(B)\,p_i^m(C)$. Likewise, the
equation that includes the effects of migration is:

$$p_i^m(B)\,p_i^m(C) = \Big(1 - \sum_{j \ne i} m_{ij}\Big) p_i(B)\,p_i(C) + \sum_{j \ne i} m_{ji}\, p_j(B)\,p_j(C).$$

Summing up over $C \le N$, factoring terms with gamete $B$ and recalling that
$\sum_C p_i^m(C) = \sum_C p_i(C) = \sum_C p_j(C) = 1$, we get

$$p_i^m(B) = \Big(1 - \sum_{j \ne i} m_{ij}\Big) p_i(B) + \sum_{j \ne i} m_{ji}\, p_j(B) \qquad (49)$$

This equation can be multiplied at both sides by $\S(A \cap B)$ and then summed
up over $B \le N$; then

$$\sum_{B \le N} \S(A \cap B)\, p_i^m(B) = \Big(1 - \sum_{j \ne i} m_{ij}\Big) \sum_{B \le N} \S(A \cap B)\, p_i(B) + \sum_{j \ne i} m_{ji} \sum_{B \le N} \S(A \cap B)\, p_j(B)$$

Since the sum is given over $B \le N$ we can recognize here an equation
regulating the migration of transformed frequencies:

$$t_i^m(A) = \Big(1 - \sum_{j \ne i} m_{ij}\Big) t_i(A) + \sum_{j \ne i} m_{ji}\, t_j(A) \qquad (50)$$
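That (49) and (50) have the same form can be checked numerically: transforming the migrated gamete frequencies gives the same numbers as migrating the transformed frequencies. The Python sketch below is an illustration with hypothetical data; it assumes integer bit masks for gametes, §(A) = (-1)^#A, and equal in and out rates so that frequencies stay normalized.

```python
def sign(mask):
    """+1 for an even number of elements, -1 for an odd number."""
    return -1 if bin(mask).count("1") % 2 else 1


def transform(p):
    """t(A) = sum over gametes B of sign(A & B) * p(B), for every bitset A."""
    return [sum(sign(A & B) * f for B, f in enumerate(p)) for A in range(len(p))]


def migrate(values, m, i):
    """Apply the common form of (49) and (50) to deme i.

    values[k] is a list indexed by bitset for deme k; m[j][i] is the rate j -> i.
    """
    others = [j for j in range(len(values)) if j != i]
    stay = 1 - sum(m[i][j] for j in others)
    return [
        stay * values[i][A] + sum(m[j][i] * values[j][A] for j in others)
        for A in range(len(values[i]))
    ]


# Two demes, two loci, hypothetical gamete frequencies and rates.
p = [[0.4, 0.1, 0.2, 0.3], [0.1, 0.2, 0.3, 0.4]]
m = [[0.0, 0.1], [0.1, 0.0]]
t_then_migrate = migrate([transform(q) for q in p], m, 0)
migrate_then_t = transform(migrate(p, m, 0))
```

The two lists agree because the transform is linear in the gamete frequencies.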

We see that migration of individuals, of gametes, and of transformed
frequencies, all have the same form. This property can be generalized to the marginal
probabilities too, for they are sums of some normal frequencies, but equilibrium
frequencies require special treatment given by (35) and (50):

$$e_i^m(B) = \prod_{b \in B} \Big[\Big(1 - \sum_{j \ne i} m_{ij}\Big) t_{ib} + \sum_{j \ne i} m_{ji}\, t_{jb}\Big] \qquad (51)$$

A model involving reproduction and migration can be formulated if we
update equation (50), where primes mean the offspring generation:

$$t_i^{\prime m}(A) = \Big(1 - \sum_{j \ne i} m_{ij}\Big) t'_i(A) + \sum_{j \ne i} m_{ji}\, t'_j(A) \qquad (52)$$

In order to see the effect of migration, it is necessary to transform this
equation into an expression containing disequilibrium terms. To this aim, let us
elaborate the term:

$$t'(A) = \sum_{B \le A} R(B/A)\, t(B)\, t(A-B)$$

by replacing $t(B)$ by $d_B + e_B$ and $t(A-B)$ by $d_{A-B} + e_{A-B}$, we have:

$$t'(A) = \sum_{B \le A} R(B/A)\,(d_B + e_B)(d_{A-B} + e_{A-B})$$

$$t'(A) = \sum_{B \le A} R(B/A)\,(d_B d_{A-B} + d_B e_{A-B} + e_B d_{A-B} + e_B e_{A-B})$$

since $e_B e_{A-B} = e_A$ and $\sum_{B \le A} R(B/A) = 1$, then

$$t'(A) = \sum_{B \le A} R(B/A)\,(d_B d_{A-B} + d_B e_{A-B} + e_B d_{A-B}) + e_A \qquad (53)$$

In particular, for any deme, the disequilibrium after reproduction alone is
$d'_A = t'(A) - e_A$ because reproduction does not change equilibrium frequencies.
Then

$$d'_A = t'(A) - e_A = \sum_{B \le A} R(B/A)\,(d_B d_{A-B} + d_B e_{A-B} + e_B d_{A-B}) \qquad (54)$$
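The disequilibrium after one round of reproduction plus migration is given
by:

$$d_i^{\prime m}(A) = t_i^{\prime m}(A) - e_i^m(A)$$

Invoking (52), we get:

$$d_i^{\prime m}(A) = \Big(1 - \sum_j m_{ij}\Big) t'_i(A) + \sum_j m_{ji}\, t'_j(A) - e_i^m(A)$$

Recalling (51) and (53), we have:

$$d_i^{\prime m}(A) = \Big(1 - \sum_j m_{ij}\Big) \Big[\sum_B R(B/A)\big(d_i(B)\,d_i(A-B) + d_i(B)\,e_i(A-B) + e_i(B)\,d_i(A-B)\big) + e_i(A)\Big]$$
$$+ \sum_j m_{ji} \Big[\sum_B R(B/A)\big(d_j(B)\,d_j(A-B) + d_j(B)\,e_j(A-B) + e_j(B)\,d_j(A-B)\big) + e_j(A)\Big]$$
$$- \prod_{b \in A} \Big[\Big(1 - \sum_j m_{ij}\Big) t_{ib} + \sum_j m_{ji}\, t_{jb}\Big]$$

with sums over $j \ne i$ and $B \le A$. (55)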

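Hence, when there is no gametic disequilibrium prior to one round of
reproduction + migration, the disequilibrium reduces to:

$$d_i^m(A) = \Big(1 - \sum_j m_{ij}\Big) e_i(A) + \sum_j m_{ji}\, e_j(A) - \prod_{b \in A} \Big[\Big(1 - \sum_j m_{ij}\Big) t_{ib} + \sum_j m_{ji}\, t_{jb}\Big] \qquad (56)$$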

To expand the product, let us note that

$$(m_1 + n_1)(m_2 + n_2) = m_1 m_2 + m_1 n_2 + n_1 m_2 + n_1 n_2$$

$$(m_1 + n_1)(m_2 + n_2)(m_3 + n_3) = m_1 m_2 m_3 + m_1 n_2 m_3 + n_1 m_2 m_3 + n_1 n_2 m_3$$
$$+ m_1 m_2 n_3 + m_1 n_2 n_3 + n_1 m_2 n_3 + n_1 n_2 n_3$$
$$= m_1 m_2 m_3 + m_2 m_3 n_1 + m_1 m_3 n_2 + m_1 m_2 n_3$$
$$+ m_3 n_1 n_2 + m_1 n_2 n_3 + m_2 n_1 n_3 + n_1 n_2 n_3$$

This can be generalized to:

$$\prod_{b \in N} (m_b + n_b) = \sum_{C} \prod_{b \in H} m_b \prod_{c \in K} n_c = \sum_{H \le N} \prod_{b \in H} m_b \prod_{c \in N-H} n_c \qquad (57)$$

where C is the condition expressed by $H \cup K = N$ and $H \cap K = \emptyset$.
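Making $m_b = (1 - \sum_j m_{ij})\, t_{ib}$ and $n_b = \sum_j m_{ji}\, t_{jb}$ we have:

$$e_i^m(A) = \prod_{b \in A} \Big[\Big(1 - \sum_j m_{ij}\Big) t_{ib} + \sum_j m_{ji}\, t_{jb}\Big]$$
$$= \sum_{H \le A} \prod_{b \in H} \Big(1 - \sum_j m_{ij}\Big) t_{ib} \prod_{c \in A-H} \Big(\sum_j m_{ji}\, t_{jc}\Big)$$
$$= \sum_{H \le A} \Big(1 - \sum_j m_{ij}\Big)^{\#H} \prod_{b \in H} t_{ib} \prod_{c \in A-H} \Big(\sum_j m_{ji}\, t_{jc}\Big)$$

$$e_i^m(A) = \sum_{H \le A} \Big(1 - \sum_j m_{ij}\Big)^{\#H} e_i(H) \prod_{c \in A-H} \Big(\sum_j m_{ji}\, t_{jc}\Big) \qquad (58)$$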

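To expand $\prod_{c \in A-H} (\sum_j m_{ji}\, t_{jc})$, we can generalize (57) to

$$\prod_{c \in A-H} \Big(\sum_j m_{ji}\, t_{jc}\Big) = \sum_{C} \prod_{c_1 \in K_1} m_{1i}\, t_{1c_1} \prod_{c_2 \in K_2} m_{2i}\, t_{2c_2} \cdots \prod_{c_L \in K_L} m_{Li}\, t_{Lc_L} \qquad (59)$$

where C means: $c_1 \in K_1$, $c_2 \in K_2$, ..., $c_L \in K_L$ with $\bigcup_i K_i = A - H$ and
$K_i \cap K_j = \emptyset$ if $i \ne j$.

Equation (59) can be elaborated to

$$\prod_{c \in A-H} \Big(\sum_j m_{ji}\, t_{jc}\Big) = \sum_{C} m_{1i}^{\#K_1}\, m_{2i}^{\#K_2} \cdots m_{Li}^{\#K_L}\, e_1(K_1)\, e_2(K_2) \cdots e_L(K_L) \qquad (60)$$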

Turning back to (58) and then to (56), we get that the gametic disequilibrium
created by migration when there was no gametic disequilibrium prior to
one round of reproduction + migration is:

$$d_i^m(A) = \Big(1 - \sum_j m_{ij}\Big) e_i(A) + \sum_j m_{ji}\, e_j(A)$$
$$- \sum_{H \le A} \Big(1 - \sum_j m_{ij}\Big)^{\#H} e_i(H) \sum_{C} m_{1i}^{\#K_1}\, m_{2i}^{\#K_2} \cdots m_{Li}^{\#K_L}\, e_1(K_1)\, e_2(K_2) \cdots e_L(K_L) \qquad (61)$$

Our point is that $d_i^m(A)$ in (61) is in general expected to be different from
zero. Therefore, we conclude that migration can create gametic disequilibrium
from zero.
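A small numerical case makes the conclusion concrete. The Python sketch below evaluates (56) for two demes and two loci, each deme being in linkage equilibrium before migration. All numbers are hypothetical, `t_i` and `t_j` hold the single-locus transformed frequencies of each deme, and the rates are taken equal in both directions so that frequencies stay normalized.

```python
from math import prod


def d_after_migration(t_i, t_j, m_ij, m_ji):
    """Evaluate (56) for deme i with one neighbour j and A = all listed loci."""
    stay = 1 - m_ij
    e_i, e_j = prod(t_i), prod(t_j)            # equilibrium values e_i(A), e_j(A)
    # Equilibrium value after migration, as in (51).
    e_m = prod(stay * x + m_ji * y for x, y in zip(t_i, t_j))
    return stay * e_i + m_ji * e_j - e_m


# Two demes with different allele frequencies at both loci.
d = d_after_migration([0.6, 0.2], [-0.4, -0.6], 0.1, 0.1)
# Two identical demes: migration changes nothing.
d_same = d_after_migration([0.6, 0.2], [0.6, 0.2], 0.1, 0.1)
```

In the first call the result is different from zero although both demes started without disequilibrium; in the second it vanishes.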

12 CONCLUSION 

Mating among diploid individuals can be reduced to mating among haploid ga- 
metes in a common reservoir. Evolution of gamete frequencies under any scheme 
of recombination can be calculated. The system has fixed points depending on 
initial conditions, determined by the so-called marginal probabilities. There
exists a measure of gametic disequilibrium, i.e., a function that relates the actual
frequencies to those of equilibrium, the Bennet measure, whose recurrence
equation has the same form as the recurrence equation of gamete frequencies when
they are written in the system of the so-called transformed frequencies.
Migration has the same form for individuals, gametes, transformed and marginal
frequencies. Migration alone can create disequilibrium from zero.

13 TO KNOW MORE 

• The theory has been extended (Christiansen, 1999). 

• How to simulate recombination in Java with or without bitsets (Rodriguez,
2009).

• The mathematical theory of the genetics of populations is a well developed
discipline. We have classics (Crow and Kimura, 1970) and modern views
(Christiansen, 2008).

15 Glossary

Alleles Variant forms of genes occurring at the same locus are said to be 
alleles of one another. 

Amphimictic That use both sexes in reproduction. 

Bit Unit of information that corresponds to a yes else no answer. A bit is 
usually encoded by 1 else 0. 

Bitset A set that represents a binary number. 

Chromosome Tiny rods in the cell that carry the genetic information. 

Deme A subpopulation that enjoys more or less identity and independence. 

Diploid An organism whose cells contain chromosomes by pairs, one from 
the mother and one from the father. 

Disequilibrium measure A function that calculates the difference between 
actual state and that of equilibrium. 

Gamete Sexual cell able to unite with another in reproduction. Sexual cells
have only one version or allele of the genetic information. Here, a gamete
is represented by a binary number to inform, say, whether in each locus the
information comes from the mother else from the father.

Gene The inheritable information that encodes for certain property. A 
portion of DNA that encodes for a protein or an enzyme or a part of it (an 
enzyme is a protein molecule that selectively accelerates a reaction in the cell). 

Haploid A cell that contains only one copy of genes, as gametes. 

Homozygote The quality of a diploid individual of having in the two chro- 
mosomes the same information for a given locus. 

Heterozygote The quality of a diploid individual of having in the two 
chromosomes two different versions for a given locus. 

Linkage The quality of being together. In nature, when two genes are close
to one another, recombination that separates them is less probable.

Locus Here, the specific position occupied by a bit in a binary number that 
represents a gamete. In biology: the specific position occupied by a gene in a 
chromosome. 

Loci The plural form of locus.

Meiosis Cellular division that at the end gives rise to gametes. 

Mendel's law Working with pea plants in the garden of his monastery,
Gregor Mendel made the first model of genetics: First law: Reproduction results
from fusion of gametes and gametes can carry only one type of inheritable 
information while organisms may carry two. Second law: for two characteristics 
the inheritable factors are inherited independently. The first law is correct for 
diploid organisms, the second is correct only when the factors are not linked 
because of physical closeness. In the present work, both laws of Mendel are 
assumed to be true. 

Panmictic population One that has no reproductive barriers or biases 
with respect to random mating. 

Parthenogenesis Optional reproduction of females without the cooperation 
of males. 

Recombinant operators A function that to each pair of gametes associates 
a third one as a result of recombination. 

Recombination A process in which a new combination of alleles is formed 
beginning from two previous ones. 

Zygote The cell that results from the fusion of the ovule and spermatozoon.